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SECTION I 


SUMMARY 

This report presents the results of a one year study performed by 
ALPHATECH, Inc. to develop a design methodology for the robust detection, 
isolation and accommodation (DIA) of sensor failures in jet engine control 
systems. This study was funded by the NASA Lewis Research Center due to 
concerns about robustness problems encountered with previous [3] jet engine 
sensor DIA algorithms. The purpose of this project was to address the funda- 
mental issues of robustness and redundancy in dynamic systems in a quant- 
itative way and develop a framework for designing DIA systems which explicitly 
tolerate unavoidable modeling errors. The scope of this project during this 
year has been limited to linear systems, although extensions to specific 
problems encountered in nonlinear engine models have been outlined. The 
results of this project as summarized at the end of this section, are believed 
to represent substantial contributions to the state of the art in failure 
detection and jet engine control. 

This project was organized into the three separate tasks described below. 

Task 1. Conduct basic research to develop a methodology for de- 
signing a system to detect, and distinguish among, a set 
of failure modes in the presence of model uncertainties; 

Task 2. Demonstrate the design methodology via application to an 
F-100 engine sensor FDIA problem at a single operating 
point ; 

Task 3. Evaluate alternative FDIA approaches proposed for the 

F-100 engine problem, and, in particular, the FDIA algo- 
rithm of NAS 3-22481 [3]. 
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In recent years a wide variety of techniques has been proposed for the 
detection, isolation and accommodation of failures in dynamic systems (see, 
for example the surveys in [ 1 ] , [ 2 ] and the numerous methods discussed in [3]- 
[10]). Some of these methods have been developed starting from general, ab- 
stract dynamic models, while others have been produced in the context of part- 
icular applications. While the general methods provide the basis for what in 
principle should be a widely applicable failure detection methodology, their 
very generality often tends to obscure the important concepts that must be 
considered in the design of practical and reliable failure detection systems. 
Conversely, the application-specific methods, which do address these basic 
concepts, typically offer little insight into how to generalize their re- 
sults. 

As a result, there has not been a satisfactory general design methodology 
for robust failure detection algorithms. In particular, the general ap- 
proaches to failure detection described in [ 1 ] — [ 3 ] take as their starting 
point mathematical models of both the system under consideration and of the 
types of failures that may occur. However, if one attempts (as, for example, 
was done in [3]) to use one of these approaches in a top-down or "canned” 
manner in which one generates the requisite overall models and then essen- 
tially plugs them into the approach chosen, the likely result will be a 
failure detection algorithm that does not work satisfactorily. The general 
reason for this is accurately stated in the request for proposals which re- 
sulted in this study: "A fundamental limitation of the performance of this 

(referring to our reference [3]) and all similar analytically redundant 
schemes is the adequacy of the model used to establish the reference upon 
which the detection/isolation decision is based and upon which successful 
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accommodation may depend." This statement raises key questions. How does one 
determine what aspects of a model are most important in the context of failure 
detection and isolation? How does one design DIA systems based on such knowl- 
edge? How does one measure performance in a way which not only is meaningful 
but also is analytical enough to provide a useful tool for comparing ap- 
proaches, pinpointing critical weaknesses and suggesting alternatives for 
improving performance? 

The philosophy of this program extends from the work reported in [4] and 
[5]. From these studies, three key concepts have been identified. First, all 
failure detection, isolation, and accommodation (FDIA) methods are based, 
either implicitly or explicitly, on the use of redundancy relations among the 
measured system variables. Consequently, the robustness of the FDIA process 
depends on the reliability of such redundancy relations, given the inevitable 
presence of model uncertainties and noise. 

Secondly, the presence of redundancy does not guarantee that all failures 
of interest can be detected and distinguished. Clearly, only observable fail- 
ures are detectable and distinguishability of failure modes is only possible 
when' the effects of each failure are distinct. 

Finally, information about the presence or absence of particular failures 
tends to accumulate over time. Failure decisions which make use of multiple 
observations of the measured system variables will have lower error probabil- 
ities than "single shot" decisions if the proper use of failure information is 
made. 

These concepts have been examined in detail and have resulted in several 
interesting solutions to the tasks described above. Our design methodology 
(Task 1) consists of a set of analytical techniques for evaluating various 
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FDIA designs and a variety of procedures for optimizing and modifying indiv- 
idual components of that design. The evaluation process is based on the 
notion of probabilistic distance metrics. By including modeling error in the 
stochastic description of system behavior, we can then interpret these metrics 
in terms of realizable FDIA performance characteristics. These metrics are 
also used to optimize the robustness of redundancy relations for a particular 
range of modeling errors. Since failure coverage and distinguishability is of 
concern, particular attention is paid to those metrics which relate to de- 
tectability and distinguishability of failure hypotheses. 

The application of this design methodology to the F-100 engine (Task 2) 
has resulted in an interesting structure for practical FDIA schemes. This 
structure consists of two levels and avoids many of the computational problems 
associated with the so-called ''optimal" methods, [1], [2]. At the top level, 
a monitoring system examines signals for the possible presence of any failure. 
A separate test for each failure is used to minimize the decision delay fol- 
lowing a failure. Alarms are raised at this level and trigger sequential 
testing procedures which: 1) compare all pairs of failure hypotheses which 

are potentially ambiguous following a given set of trigger alarms, and 
2) verify that the alarm was not a false trigger. The hypothesis tests make 
use of the residual signals which are derived from the optimization problems 
in our robust FDIA design methodology. The "parameters” of these tests are 
determined through a trade-off analysis which makes use of the metrics we have 
developed. Final failure decisions are then made on the basis of these in- 
dividual test results. 
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Finally, in Task 3, we demonstrate the generality of the distance metric 
evaluation tool by defining methods of evaluation for the FDIA scheme of ref- 
erence [3]. These schemes are then applied to a similar system and some po- 
tential methods of improvement are suggested. These improvements attempt to 
maintain much of the structure of the original algorithm including the use of 
Kalman filters for generating residuals. We attempt to identify those parts 
of the system model which most degrade performance and try to remove that seg- 
ment from the filtering process. 

The remainder of this report is organized as follows. Section 2 intro- 
duces some of the fundamental concepts and problems of failure detection and 
isolation. Section 3 provides an overview of an approach to FDI design which 
addresses the fundamental problems of Section 2 and requires analytical tools 
such as those described in Section 4 and Appendix A. In Section 5, we discuss 
several specific analytical FDI design procedures which make use of the re- 
sults in Section 4. An example of a complete sensor FDI system for the F-100 
engine at a single operating point is also provided and simulation results are 
presented. We also apply our results to the evaluation of a Kalman filter 
based FDI algorithm. Finally, Section 6 summarizes the contributions of this 
project and provides recommendations for future work. Appendix B provides an 
extension of our analytic results to the robust accommodation problem and 
Appendix C documents the design and simulation software which was generated 
for this project. 
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SECTION 2 


INTRODUCTION, MOTIVATION AND BACKGROUND 

In order to motivate the analytical results discussed in Section 4 and 
Appendix A, we present below several pertinent questions and some discussion 
of each. These questions form a logical sequence, starting from what at first 
may seem to be the simplest and most transparent query: 

• What is failure detection, isolation, and accommodation (FDIA)? 

Obviously, FDIA deals with the problem of detecting deviations from 
normal behavior in a specified components (sensor or effector), isolating the 
particular component which has "failed”, and initiating the appropriate 
adjustment to minimize the effect of the failure. The key point in this 
sentence is that in order to detect, isolate, and accommodate deviations, one 
requires a specification of what "normal behavior" is and of what a "devia- 
tion" is (i.e., models of the system and of the size and perhaps the nature of 
deviations to be detected). Furthermore, for each type of anomaly to be de- 
tected, the model must provide sufficient redundant information to allow one 
to detect the anomaly and to distinguish this anomaly from others. For 
example, in a triplex sensor system, in which there are three identical 
sensors of each type, one can perform voting by examining each triple to 
determine if they are consistent (i.e., normal). Here the model information 
used is that the three sensors measure the identical quantity, and the model 
of a deviation can be specified in several ways, such as in terms of 
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manufacturer's instrument specifications. As a second example, consider a 
relatively simple and often-used check, in which successive samples of the 
output of a particular sensor are examined to determine if there is an obvious 
inconsistency. Here the model information used is a crude measure of the 
bandwidth of the variable being sensed. Finally, consider a simple system 
involving linear motion and in which one has a velocity sensor and an accel- 
erometer. Here the kinematic model v = a provides a mechanism for obtaining 
one redundant relationship between these sensors. 

In the terminology used by Chow and Willsky [4], [7], the three examples 
just described are illustrations of direct (or hardware) redundancy, temporal 
(or self-test) redundancy, and analytic (or functional) redundancy, respec- 
tively. While there are clear differences among them, it is their 
similarities — in terms of being based on models and, more explicitly, on 
redundancy imbedded in those models — that we wish to stress. This permits us 
to construct a unified framework in which to examine and compose different 
approaches to failure detection and their robustness properties. 

• What does an FDIA algorithm do? 

Here again we follow Chow and Willsky. Roughly speaking, all failure de- 
tection systems can be described in terms of the conceptual block diagram of 
Fig. 2-1. There are three basic parts of the failure detection process. The 
first part, termed "residual generation" in the figure, uses the model that 
has been specified to generate "residual" signals which nominally should be 
near 2 ero and which will deviate from zero in characteristic ways when part- 
icular failures occur. The way in which residuals are generated differs 
markedly from method to method. For example, in a triplex system, if yi(k). 
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Figure 2-1. Three Stage Structure of the FDI Process 

y 2 (k), an d y 3 (k) denote the outputs of three identical sensors, then 
r l(k) = Yl(k)-y 2 (k) and r 2 (k) = y 2 (k)-y 3 (k) can be thought of as the residuals 
used in a voting system. In more complex methods, such as the Generalized 
Likelihood Ratio (GLR) method [1], [2], [7], [9], Kalman filters are used to 
generate the residuals, while in other advanced methods including the de- 
tection filter approach of Beard [10] and Jones [18], Kalman-like filters are 
designed, but with gains chosen in particular ways so as to make particular 
failures more readily apparent. 

The remainder of the FDIA system is the decision mechanism, which con- 
sists of the information collection and decision rule stages. This mechanism 
involves the examination of the residuals to detect, isolate, and accommodate 
failures. Depending upon the residual generation procedure, this process may 
take on quite different forms. In the above triplex system example, a simple 
rule of the form 

ri(k) small, r 2 (k) small No failure 

r^(k) small, r 2 (k) large y 3 failed 

ri(k) large, r 2 (k) small 

ri(k) large, r 2 (k) large 
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yi failed 
y2 failed 






can be used. Similarly, in the detection filter approach, the vector of 
residuals is constructed in a way so that relatively simple rules somewhat 
similar to voting can be used. For the Kalman filter-based methods, such as 
GLR, the effect of particular failures on the residuals can be calculated, but 
in general the nature of these effects is not as simple as in the other 
methods just discussed. Rather, what is required in the information, collec- 
tion and decision rule stages is a very indirect detection test such as the 
WSSR method used in [3], which must be followed or replaced by a more complex 
isolation process such as the GLR procedure of correlating the residuals with 
failure signatures. 

It is here that we begin to see an important distinction. The voting and 
detection filter approaches involve relatively simple decision mechanisms. 

This is because they and several other approaches to FDIA (such as the "param- 
eter synthesis" approach in [3]) make explicit use of specific redundant rela- 
tionships in the residual generation procedure . On the other hand, GLR and 
other Kalman filter-based methods make only implicit use of such relationships 
in terms, for example, of the resulting failure signatures. However, the 
advantage of an implicit approach over the explicit methods is that a Kalman 
filter-based approach represents a statistically optimal method for extracting 
and using information embedded in the residuals . 

• Why is robustness an issue? 

The answer to this question is obvious. As we have argued, all failure 
detection methods are based on the utilization — explicitly or implicitly — of 
dynamic models and, more directly on the redundancy relationships such models 
imply. If the model is in error, the redundancy relationships will also be in 
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error, and consequently, the behavior of residuals will deviate from their 
ideal characteristics. For example, in a triplex system, it will never be the 
case that the three sensors are absolutely identical. As an example consider 
a problem in which one has three accelerometers or gyros. In this case, scale 
factor errors and misalignment angles, to mention only two possibilities, 
introduce errors into the assumption that the instruments are measuring the 
same variable. In the case of highly complex nonlinear systems such as a jet 
engine, one uses a linear model to describe behavior at a single operating 
point. Thus, clearly, errors will exist due to the linearization error as 
well as the uncertainty in predicting the operating point. Furthermore, these 
errors vary over the operational envelope of the system as well as during 
transient operation. 

Intuitively, a residual generation procedure attempts to remove the pre- 
dictable part of sensor outputs and to produce signals whose behavior under 
ideal conditions is unaffected by the value of the variables being sensed. 

For example, the signal r(k) = yi(k)-y 2 (k), where yi and y2 are identical 
sensors, should only deviate from zero due to sensor noise or to a failure in 
one of the two sensors. However, if there is a scale factor difference be- 
tween the two, there is now another error source whose magnitude is modulated 
by the variable being sensed. In the case of a Kalman filter, the vector of 
residuals r(k) is ideally a white noise sequence uncorrelated with previous 
measurements. However, if there are any model uncertainties r(k) will in 
general have nonzero mean and a relatively complex correlation structure. 

The preceding paragraph suggests one important point and leads directly 
to another. The first point is an apparent advantage of techniques such as 
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voting and parameter synthesis, which make explicit use of redundancy rela- 
tions. In such a case it is relatively straightforward to deduce the effect 
of modeling errors on the residuals, and consequently one can use this infor- 
mation to assess the relative merits of different redundancy relationships and 
can determine how to design decision rules based on those that are deemed to 
be "robust enough.” On the other hand, methods such as those based on Kalman 
filters (and to a lesser extent detection filters) make only implicit use of 
redundancy relations, and thus it is apparently far more difficult to assess 
how uncertainties in these relations affect performance. This is precisely 
where one finds the difficulty with "top-down” approaches to FDIA algorithm 
design such as were used in [3]. In particular, in a given system one gener- 
ally has several sources of redundancy which are of different quality or cer- 
tainty. A top-down approach using an implicit method (such as GLR, WSSR, or 
the bank of observers approach) and a full system model in effect mixes to- 
gether redundancy relations that are accurate with ones that are quite uncer- 
tain, resulting in either severe sensitivity problems or severe limitations of 
detectable failures. A more sensible approach is the separate use of these 
relations so that one can take optimum advantage of each. Thus, one is led to 
the concept of identifying the most reliable redundancy relations and design- 
ing optimum algorithms based on these. These ideas form the basis of our work 
reported here and in [12] and [13]. 
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SECTION 3 


OVERVIEW OF THE DESIGN APPROACH 

In this section, we present a brief overview of the major concepts behind 
our approach to designing robust FDI algorithms. 

The first part of our design methodology is to identify those portions of 
the system dynamics (called parity relations) that are known with the most 
certainty, as the use of residuals based on these relations will be of great 
value in minimizing false alarms. Thus the first problem we wish to address 
is that of defining some "robustness metric” that quantifies how close to zero 
each residual is under normal conditions given the presence of model errors 
and noise. We can then develop a rcnk ordering of parity relations in terms 
of robustness and address the next problem which is coverage. 

Although solution of the first problem, in principle, provides a set of 
robust parity relations, there is no guarantee that the failure modes which 
must be identified are distinguishable from normal operation (i.e., that all 
failure modes are covered) . Thus the second problem is to define a metric to 
assess the ability of the relations Identified in the first step to detect a 
specified set of failure modes. Each failure mode is also modeled with an 
allowance for errors so that the effect of uncertainty associated with the 
failure model will be minimized. As a result, the initial set of relations 
may be augmented and residuals which are useful in detecting particular fail- 
ure modes not well-covered by the initial group are generated. Let us note 
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two aspects of this problem. The first is that it requires a second metric 
that measures the ability of a particular relation to distinguish between 
normal conditions and a particular failure mode (i.e., to give decidely larger 
values under the particular failure than under normal conditions) given the 
presence of model error and noise. Again this metric can be used to rank- 
order relations with respect to their usefulness for detecting particular 
failures. The second point is that the relations added at this stage are less 
reliable than the first ones obtained, in that they may have larger values 
when no failure has occurred. They may be needed, however, to achieve the 
desired coverage (i.e., to achieve a specified probability of detection for 
all failures). Note also that the metric used should provide a guide to the 
minimum magnitude of a particular failure that can be reliably detected (i.e., 
to achieve a specified probability of false alarm) . 

The final step in the design is concerned with the problem of distin- 
gui8hability: given that a failure has occurred, can we determine which 

failure mode has occurred. Here again we need a metric that measures the 
ability of a parity relation to distinguish a particular failure mode from an 
alternative set of possible events corresponding to one or more of the other 
failure modes. Note again that if additional relations are needed, they will 
be inherently less reliable under normal conditions. However, at this stage 
we can, if necessary, avoid the impact such relations would have on false 
alarm rate by using a two-level structure. Specifically, the relations deter- 
mined in the first two stages are sufficient for detection of all failures, 
but may not be able to Isolate all of them. Thus, one could use these rela- 
tions to detect and trigger the use of additional relations for isolation 
only. Fig. 3-1 gives a complete picture of the FDI system design process. 
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Figure 3-1. Overview of the FDI System Design Process 













For the most part we have discussed the residual generation phase of the 
FDI system. However, the concept of metric-based evaluations with a proba- 
bilistic description of model error is also of value in designing the inform- 
ation collection phase, i.e., in determining the data length required to 
achieve desired performance levels and in specifying the details of how suc- 
cessive residuals are accumulated. These issues are discussed in more detail 
in Section 5. 

In the next section, we describe the metrics we have considered and indi- 
cate how they are used in the design of robust FDI systems, and the evaluation 
of alternative FDI schemes. 
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SECTION 4 


ANALYTICAL RESULTS 

In this section we develop the analytical basis for the design and analy- 
sis of FDI systems. We start with a definition of redundancy and pose several 
optimization problems which, when solved, provide relationships among measured 
variables which can be used in the FDI process. These optimization problems 
are then interpreted as special cases of statistical discrimination. Various 
distinguishability "metrics” are discussed which provide the analytical basis 
for FDI performance evaluation. These metrics also provide an alternative 
mechanism for generating redundancy relationships which are "optimal” in the 
sense of the performance metric which is used. 

PROBLEM FORMULATION 

Consider a linear, discrete time, time-invariant dynamic system with 
uncertainty characterized by a finite set of system parameters viz., 

x(k+l) = Ajj, x(k) + Bji u(k) + E % w£(k) (4-1) 

y(k) = C* x(k) + D* u(k) + vjt(k) (4-2) 

where 

x(k) « NS - dimensional state vector at time k, 
y(k) = NO - dimensional measurement vector, 
u(k) = NC - dimensional measurable control vector, 
w(k) and v(k) are process and measurement noises respectively and are assumed 
to be zero-mean, white, and Gaussian with covariance matrices Q& and Rjj, 
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respectively, and where 2 = 1,2,...,L with the a priori probability of the 2th 
model being correct denoted by p^. (Note that it is indeed possible to form- 
ulate this problem using a continuum of parameter variations. Similar results 
may then be derived) . 

A redundancy relationship is now defined as a linear combination of mea- 
surements and controls over a finite window of observation. Specifically, if 

we let Y T (k) - (y T (k) , y T (k+l) , . . .y T (k+p) ), and ^(k) = (u T (k), u T (k+l) , . . .u T - 
P P 

(k+p) ), then redundancy relations take the form 


_v(k) = 


Yp(k) 

U p (k) 


W y T Y p (k) + W u Tu p (k) 


(4-3) 


where v(k) is the t-dimensional residual vector which, under ideal circum- 
stances (no noise or modeling error) is identically zero. The matrix W is 
sometimes referred to as the parity check matrix. Next, we can expand Yp(k) 
in terms of the parameters of the 2th system model as 

C2 


Djj. 0 . . . 0 

C$2 D i 


Y p (k) - 


C 2 A 2 


x(k) + 


U p (k) 


C$2 P 


c 2 a 2 P _ 1 b 2 * • * C $2 °2 


0 ... 0 


w(k) 

C$2 0 


• 

• • 


• 

* 

• 


• 

CflAjjP^E 2 • • .C$2 0 


w(k+p) 



(4-4) 


v(k+p) 
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or 

Y p(k) = M pi x(k) + N p * U p (k) + IIpjj W p (k) + V p (k) (4-5) 

Thus under the Jtth model hypothesis Eq. 4-3 can now be written as, 


v(k) 



Np i 


x(k) 

+ 




U p (k) 


0 

I 


— — 


— — 


W p (k) + 



(4-6) 


Now, consider the simplest case when no modeling error or noise is 
present. We can make _u(k) identically zero by choosing W as an orthogonal 
basis for the left null-space of the matrix. 


M 


P 


A 



(4-7) 


That is, we find all the vectors for which wTM p = 0 and form the parity check 
matrix using these vectors for its rows. 

Comments 

1. The number of independent parity checks for any p is NO(p+l)-NS. 

2. As discussed in [19], one need only look at values of p=0,...,NS 
to find all of the independent parity checks. 

3. The solution for W can also be obtained by finding the vectors 
which satisfy Wy T M p =0, and then solving W y T N p + W U T = 0 where 
WT = (W y T, W U T). 

Table 4-1 presents the details of the linear reduced-order F-100 engine 
model used to demonstrate the analytical results in this project. This model 
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TABLE 4-1. DETAILS OF F-100 ENGINE MODEL 


Model 

8. 7763920E-01 

9. 544804 7E-02 

A <NS,NS) -- 

-7 . 6181483E-03 

3.2125510E-02 

2.3492503E-03 

9.3342350E-01 

2.9041661E-02 

-8.4782876E-02 

3.8686348E-04 

-1.4044064E-04 

9.8545470E-01 

1.1503949E-04 

3.92U614E-03 

-9.3381410E-04 

-7.8085437E-04 

9.5632940E-01 

4 . 4787161E-02 

6.4114124E-02 

B (NS.NC) -- 

-1 . 5856765E-02 

2.9632282E-03 

2 . 1774249E-02 

1.4993310E-02 

-1.2106993E-03 

-2.1525344E-03 

1.4233845E-03 

9.6010260E-04 

-7.8124125E-05 

-2.6199684E-06 

1 . 7201H0E-03 

5.1 6288 16E-04 

2.1082116E-04 

-1. 66006 19E-05 

1 . OOOOOOOE+OO 

O.OOOOOOOE+OO 

C (NO, NS) -- 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

-O.OOOOOOOE+OO 

1 .OOOOOOOE+OO 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

-3.6199179E-02 

8.6627400E-01 

-1.7538881E-02 

-1.4256181E-02 

2 . 3806910E-01 

4.1814116E-03 

-2.3846535E-02 

-2.1576596E-02 

-1.3806290E+00 

-6.4474080E-01 

-2.1066390E-01 

2.4134478E-02 

O.OOOOOOOE+OO 

0.0000000E-00 

D (NO.NC) — 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

2 . 2995210E-01 

-4.2003830E-01 

2.9931962E-02 

4.6308138E-03 

l . 1445380E-01 

-5.3470520E-01 

4.3202233B-02 

1.8882763E-04 

9.3831810E-01 

5.8143690E-01 

-1.8835608E-02 

-3.4454153E-03 


Model Uncertainty (for Robust Redundancy) 


DEL A (NS, NS) 


1 . 5380000E-02 
3, 1333333E-03 
1.7636009E-03 
4.4277371E-03 


2.1089999E-02 

9.1000004E-03 

1.1241082E-03 

2.8102705E-03 


1.40U107E-03 

8.7536051E-04 

2.4200000E-03 

2.7039999E-03 


DEL B (NS.NC) — 


3rM95694E-03 

2.4867395E-03 

8.5665914E-04 

2.1998493E-03 


1.6469174E-03 

4.5617996E-04 

1.7833420E-04 

4.4917196E-04 


5. 1843788E-04 
1.5556102E-04 
5.5528384E-05 
1.3978184E-04 


DEL C (NO, NS) 


O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

6.8665206E-02 

3.6485530E-02 

1.1721950E-01 


O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

1.2315690E-01 

1.9306340E+00 

9.3263514E-02 


O.OOOOOOOE+OO 
O.OOOOOOOE+OO 
3.72894 J5E-03 
1.9171619E-03 
9.9560004E-03 


DEL D (NO.NC) 


O.OOOOOOOE+OO 
O.OOOOOOOE+OO 
1 . 1 518222E-02 
6 . 5602008E-03 
6.2302962E-02 


O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

3.2224100E-02 

2.4830708E-02 

4.0404614E-02 


O.OOOOOOOE+OO 
O.OOOOOOOE+OO 
6.0293591E-03 
3.7279800E-03 
8. 7860543E-03 


1.2169647E-03 

1.9486381E-04 

1.1208000E-03 

7.2799991E-03 


2.2065216E-04 
8. 5836473E-05 
1.6012165E-05 
3.9906063E-05 


O.OOOOOOOE+OO 
O.OOOOOOOE+OO 
4.6490412E-03 
3.1758063E-03 
1 . 3030000E-02 


O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

3.0286349E-03 

2.7165335E-04 

3.3902435E-03 


-3.6747955E-02 
-1.2677516E-02 
9.1238425E-04 
3. 7244798E-03 


O.OOOOOOOE+OO 
O.OOOOOOOE+OO 
-7 . 1547480E-01 
1.2127620E+00 
3.4962920E-01 


8.6416323E-03 
2.6413030E-03 
8.5322355E-04 
2. 1532946E-03 


O.OOOOOOOE+OO 

O.OOOOOOOE+OO 

1.6963300E-01 

4.1060939E-02 

2.8485480E-01 


19 




was developed in [11] along with an estimate of the uncertainty of each matrix 
element. This uncertainty information will be used to generate subsequent 
results. The nominal operating point corresponds to the maximum power lever 
angle (PLA = 83 degrees) and the ambient conditions associated with flight at 
the sea-level-static Mach (M) and altitude (h) condition; M = 0.6 and 
h = 10,000 feet. The sampling interval is At = .02 seconds. The states, out- 
puts and controls represent perturbations from the nominal values. The engine 
variables (defined in [11]) corresponding to these perturbations are, 

x = [Nl, N2, T t4 , T t4 . 5 ] 
u = [WF, AJ, FGV, SVA, BLC] 
y = [HI, N2, P t 4» P t 6» FTIT 1 

Parity checks, W, generated from the left null space of M p (with A, B, C, 
D defined in Table 4-1) for p=0 and 1 are shown in Table 4-2. The first line 
in each parity check corresponds to the coefficients which multiply the 
perturbations y(k-p) and u(k-p), the next line multiplies y(k-p+l) and 
u(k-p+l) and so on up to y(k) and u(k) . 

ROBUST REDUNDANCY 

The above results are easily extended for the case of model uncertainty. 
Since the general residuals formed by W^Mpfc cannot be zero for every model we 
pose the problem, 

min J « E* { | |w T M p 1 2 } (4-8) 

subject to W^W ■ I 
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TABLE 4-2. PARITY CHECK FOR F-lOO ENGINE 


LEFT NULL SPACE OF M p 


p 

N1 

Scaled Sensor Coefficients 

FT IT 

UF 

Scaled Control Coefficients 

1 pn 

N2 

PT4 

PT6 

AJ 

FGV 

warn 

1 

0.0927 

0.1180 

-0.1272 

-0.1272 

-0.3316 

0.0352 

0.0352 

0.0000 

0.0008 

0.3259 


-0.0688 

-0.2678 

0.3590 

-0.0089 

0.3345 

-0.4048 

-0.0543 

-0.0039 

-0.0005 

0.1472 

1 

0.3068 

0.0997 

-0.2785 

-0.4249 

0.2423 

-0.1275 

-0.4983 

0.0338 

0.0025 

0.2402 


0.0428 

0.2579 

-0.0216 

-0.2374 

-0.1069 

0.1324 

-0.0739 

0.0089 

-0.0002 

0.3099 

0 

-0.0863 

0.4048 

-0.4662 

0.3119 

0.0035 

0.0682 

-0.0311 

0.0005 

0.0021 

-0.7131 

1 

0.4017 

0.1319 

-0.0024 

-0.0017 

-0.1340 

0.1600 

0.1213 

-0.0132 

0.0012 

0.0205 


-0.4320 

0.3419 

-0.4505 

-0.0248 

0.1783 

-0.0609 

-0.3061 

0.0179 

0.0027 

-0.3546 

1 

0.4333 

0.0721 

0.2728 

0.0732 

0.1347 

-0.1800 

0.0971 

-0.0136 

-0.0004 

0.0460 


-0.5512 

-0.4604 

0.1453 

-0.0196 

-0.1964 

0.1532 

0.1647 

-0.0072 

-0.0013 

0.1964 

1 

0.0359 

0.1126 

-0.1272 

0.3139 

-0.0681 

0.0685 

-0.1697 

-0.0149 

0.0009 

-0.4570 


0.0145 

0.0362 

0.0055 

-0.4717 

0.1017 

-0.0427 

-0.3090 

0.0221 

0.0004 

0.5405 

1 

-0.0303 

-0.4924 

0.5286 

-0.0539 

-0.0447 

-0.0736 

0.2191 

-0.0143 

-0.0026 

0.4593 


0.1221 

0.1359 

-0.1139 

-0.2586 

0.0465 

0.0122 

-0.2131 

0.0155 

0.0007 

0.2159 


The constraint In Eq. 4—8 insures that WW. The solution for W now involves 
an eigenvalue decomposition of the matrix C 0 = E* {M p £ M p £ T }. That is, each 
row of W satisfies, 

T T 

Wi C 0 = Xm (4-9) 

t 

The optimal value for J is J* = ][ so that one easily sees that the t-best 

i=l 

redundancy relationships are the t eigenvectors of C 0 corresponding to the t 
smallest eigenvalues. This approach is referred to as the Robust Redundancy 
Null Space Approach. 
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Now, in contrast to the previous formulation which did not include un- 
certainty, there are no longer a finite number of independent parity checks; p 
may take on any value. One would expect, however, that W would become ill- 
conditioned (as p gets large) if only the best parity checks for each p are 
used. That is, as p grows past the maximum for perfectly known systems (Lou's 
result [19]), some parity checks may be nearly dependent resulting in a parity 
check matrix which is nearly singular. The number of parity checks t and the 
"order” - p are design parameters which need to be chosen in the design of the 
decision process. Typically, p is chosen only up to NS in keeping with Lou's 
result [19]. 

Table 4-3 shows the 10 best parity checks of order p = 0, 1 for the F-100 
engine. Each metric, Xf, in the table can be no larger than 5.62 for p = 0 

and 8.60 for p = 1. (The largest eigenvalues of C 0 for each p) . A close 

examination of Table 4-1 points out an important fact discussed in the pre- 
vious section. That is, a set of redundancy relations does not guarantee 
coverage of all failure modes of interest. This can be seen as follows. 

The sensitivity of each parity check to a sensor bias failure can be 
derived from the sum of the coefficients which multiply that particular 
measurement. That is, a bias of size bf in sensor i results in a bias in the 

residual v of size bilei^ . . .ei T ]wy where e^ is a unit vector in the 

"direction" of sensor i, and Wy is the parity check corresponding to v. 
Referring to Table 4-3 we may conclude that many of these parity checks are 
not particularly sensitive to sensor #1 (Nl) failures (e.g., for the first 
parity check the Nl-failure-sensitivity is -.5105 + .6851 = .0946. The tenth 
parity check has Nl-failure-sensitivity of -.1049). The usefullness of any 
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TABLE 4-3. PARITY CHECKS FOR F-100 ENGINE 
SMALLEST EIGENVALUES Of Co 
(Robust Redundancy - Null Space Approach) 


HO CONSTRAINTS 


' — 



Scaled Sensor Coefficients 


Scaled Control Coefficients 


p 

METRIC 

N1 

N2 

PT4 

PT6 

FT IT 

WF 

AJ 

FCV 

SVA 

8LC 

1 

2.604E-04 

-0.5905 

0.6851 

-0.3240 

0.2676 

0.0044 

0.0108 

-0.0013 

0.0038 

0.0031 

0.0056 

-0.0408 

-0.0084 

-0.0481 

0.0033 

0.0109 

-0.0003 

-0.0014 

0.0000 

0.0307 

0.0010 

1 

3.I55E-03 

0.2435 

-0.2408 

-0.4618 

0.5366 

0.0676 

-0.0880 

-0.2346 

0.2476 

0.0156 

0.0049 

-0.0029 

-0.0127 

-0.0974 

0.0906 

0.0064 

-0.0085 

0.0010 

0.0004 

0.3236 

-0.3603 

l 

6.576E-03 

-0.1797 

0.1427 

0.3313 

-0.4372 

-0.0350 

0.0634 

-0.3326 

0.3587 

-0.0340 

-0.0105 

0.0806 

-0.0444 

-0.1784 

0.2230 

0.0180 

-0.0177 

-0.0019 

-0.0003 

0.3938 

-0.3936 

1 

2.087E-02 

0.0767 

-0.0129 

-0.0111 

0.1735 

-0.5885 

0.5089 

-0.1099 

0.1145 

-0.0661 

0.1279 

0.2130 

-0.2604 

-0.2693 

0.2043 

0.0169 

-0.0157 

0.0039 

-0.0017 

-0.2395 

0.1648 

0 

2.I47E-02 

-0.0856 

-0.5930 

0.6019 

-0.0041 

-0.0674 

-0.0703 

0.2972 

-0.0172 

-0.0029 

0.4249 

1 

2.814E-02 

-0.0700 

-0.1573 

-0.4761 

-0.3452 

0.3242 

0.4754 

-0.0067 

0.9921 

-0.0962 

-0.0700 

0.0239 

-0.0394 

0.1972 

0.2493 

-0.0114 

-0.0143 

-0.0013 

-0.0023 

0.2510 

0.3356 

0 

4.446E-02 

0.6522 

0.1035 

0.2382 

0.0015 

0.4500 

-0.5239 

-0.1689 

0.0036 

0.0006 

0.0415 

1 

S.377E-02 

0.2415 

0.1678 

0.0120 

-0.0281 

0.0287 

0.1357 

-0.0276 

0.0309 

0.5011 

-0.2622 

-0.5608 

0.2388 

-0.3518 

0.2393 

0.0211 
-0.01 II 

0.0007 

-0.0016 

-0.0564 

0.1247 

1 

6.272E-02 

0.4126 

0.3722 

0.0696 

-0.0183 

0.2580 

0.1482 

0.0076 

-0.0060 

0.1087 

0.4616 

-0.1605 

-0.5238 

0.0741 

-0.2241 

-0.0097 

0.0068 

-0.0004 

0.0011 

0.1282 

-0.0141 

0 

5.500E-01 

-0.1049 

0.0356 

0.1948 

0.3466 

0.264 

-0.2453 

0.5595 

-0 .0441 

-0.0017 

-0.6736 


parity check for detecting a bias failure in 
robustness metric as well as its sensitivity 


Sensor 1, however, depends on its 
to failures. 


ROBUST DETECTION OF SENSOR FAILURES 

In the case of sensor failures, a relatively straightforward modification 
of the robust redundancy problem can be formulated. Since sensor failures 
show up in fixed directions in the space of measurable quantities we can think 
of minimizing the metric in Eq. 4-8 subject to the constraint of fixed sensi- 
tivity to a particular sensor direction. That is, we want to choose the w 
which minimizes 

J* - min Eg { ||w T Mpji|| 2 } (4-10) 

subject to 

" wTb^ ■ K 
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The solution to Eq. 4-10 is obtained by forming the Lagrangian function and 
taking derivatives with respect to w. The solution is given by 

w = X • C 0 -l bi (4-11) 

where X = K/bjTc 0 ~lb;i. and the optimal value of the cost function is J* = X*K. 
This result is intuitively pleasing since it points directly to the tradeoffs 
which must be made in designing an FDI system, namely, that greater sensitiv- 
ity to failures (as embodied by large K and relating directly to the proba- 
bility of detection, P^) is obtained only at the expense of decreased robust- 
ness (as embodied by large values of J* and relating directly to the probabil- 
ity of false alarm, PpA.)* 

Note that the above solution implies that there is only one parity check 
for each value of p which makes sense. Table 4-4 shows the result of applying 
this result (sometimes referred to as the robust detection null space ap- 
proach) to the F-100 engine model for p = 0,1 and normalizing K so that 
W T W =1. The number in the metric column is w^C^w and may be compared 
directly to the metrics obtained in Table 4-3 to assess relative robust- 
ness. 

As an alternative to the problem posed in Eq. 4-10, we may consider the 
optimization of a metric which represents the failure signal to model noise 
ratio. That is, 

w = arg max J » (w T bi) 2 / E$ { | |w T M p ji| | 2 } (4-12) 

Interestingly, the solution is identical to Eq. 4-11 where the scalar X may 
take on any value. That is, for additive sensor failures, minimizing the mean 
square value of the residual with a fixed failure sensitivity is equivalent to 
maximizing the signal-to-noise ratio. 
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TABLE 4-4. PARITY CHECKS FOR F-100 ENGINE 

(Robust Detection - Null Space Approach) 


SEKSITIVITT CONSTRAINTS 


Sensor 




Scaled Sensor Coefficients 


Scaled Control Coefficients 


Sensitivity 

Constraint 

P 

METRIC 

m 

K2 

PT4 

PT6 

FTIT 

vr 

AJ 

PCV 

SVA 

BLC 

1 

0 

4.487E-02 

0.6828 

0.2472 

0.0656 

-0.0023 

0.4463 

-0.4577 

-0.2341 

0.0080 

0.0013 

-0.0678 

2 

0 

2.231 E— 02 

0.1 357 

0.6185 

-0.5738 

0.0063 

0.1052 

0.0247 

-0.2999 

0.0168 

0.0028 

-0.4097 

3 

0 

2.266E-02 

0.0350 

-0.5573 

0.6469 

0.0023 

0.0194 

-0.1659 

0.2625 

-0.0164 

-0.0027 

0.4154 

4 

0 

8.S43E-0I 

-0.0508 

0.2525 

0.0944 

0.5901 

0.0682 

-0.1552 

0.3104 

-0.0268 

-0.0002 

-0.6714 

5 

0 

4. 391 £-02 

0.6411 

0.2752 

0.0523 

0.0044 

0.4653 

-0.4741 

-0.2471 

0.0085 

0.0014 

-0.0895 

1 

l 

3. USE-04 

-0.5640 

0.7021 

-0.3169 

0.2831 

0.0023 

0.0111 

0.0021 

0.0001 

0.0195 

0.0188 

-0.05 70 
-0.0212 

-0.0567 

-0.0062 

0.0111 

0.0001 

-0.0013 

0.0000 

0.021! 

0.0024 

2 

1 

8.S46E-04 

0.6319 

-0.6877 

0.3088 

-0.1201 

-0.0570 

-0.0690 

-0.0024 

0.0005 

0.0113 

0.0117 

0.0382 

0.0039 

0.0159 

-0.0357 

-0.0094 

0.0020 

0.0020 

0.0003 

-0.0641 

-0.0487 

3 

1 

6.366E-03 

-0.5427 

0.6170 

-0.4167 

-0.0048 

0.1986 

0.2218 

0.0044 

-0.0009 

-0.0042 

-o.ooot 

-0.0735 

-0.0500 

0.0463 

0.0930 

0.0044 

-0.0057 

t i 

o o 

© o 
o o 
O 

sO U 

0.1495 

0.1462 

4 

1 

4.984E-03 

-0.4789 

0.5498 

-0.3207 

0.2829 

0.0089 

0.0114 

-0.1895 

0.2450 

0.0010 

0.0063 

-0.0125 

-0.0363 

-0.1374 

0.1299 

0.0177 

-0.0109 

-0.0015 

0.0000 

0.2586 

-0.2913 

5 

1 

6.190E-03 

-0.3127 

0.6909 

-0.3425 

0.4805 

-0.0089 

0.0012 

0.0095 

-0.0073 

0.1126 

0.1152 

-0.1438 

-0.1138 

-0.1044 

-0.0706 

0.01 10 
0.0028 

-0.0003 

0.0004 

-0.0225 

-0.0197 


STATISTICALLY BASED METRICS FOR ROBUST FDI 

Ideally, the parity relations which are of most use are those which would 
allow us to minimize the probability of making erroneous decisions. However, 
in most situations, direct minimization of the error probability, so as to 
determine an optimal set of parity relations, is not possible. This is be- 
cause an analytic expression for the probability of error is often difficult 

% 

to come by, and even if it can be found, the expression is too complicated for 
optimization. Therefore, it is useful to search for criteria that are easier 
than the error probability to evaluate and optimize but are, in some sense, 
related to the error probability. Statistical distance or divergence measures 
between two probability distributions under two hypotheses (normal and failure 
mode i, or failure mode i and failure mode j) provide such easily compatible 
criteria. 
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Reference [12] (Appendix A) identifies (among others) two useful distance 
measures which we consider here; the J-divergence and the Bhattacharyya 
distance. The J-divergence between the two probability density functions 
(pdfs) p(v|Hi) and p(v|Hj) is defined by, 


Jij = 



[p(v|Hi) - p(v|Hj)] £n 


P( v|H ± ) 
p( vjHj) 


dv 


(4-13) 


and the Bhattacharyya distance, B^j, is defined as 


®ij = - ^ n 



IP(v|Hi) 


. 1/2 

p(v|Hj)] dv 


(4-14) 


Thus in order to compute and/or optimize Eqs. 4-13 and 4-14 we must have 
(or approximate) the pdf's under different hypotheses. Following [13], we 
assume that under the ith hypothesis, and £th model, the system is, 

x(k+l) = A 4 i x(k) + E £ i w*(k) 4 d £ l(k) (4-15) 

y(k) = C £ i x(k) + V£l(k) + bjjl(k) (4-16) 

where djjl and b£l account for additive failure effects (e.g., bias, drifts). 
Thus the measurements y(k) (and hence the residuals _v(k) ) under the £'th model 
are Gaussian random variables which can be characterized by a steady state 
mean vector and covariance matrix, and p(v|Hi) is a weighted sum of Gaussians, 
(WSOG). Computation of Jij and Bij in terms of the parameters in Eqs. 4-15 
and 4-16, and the parity check matrix, W, is now possible in principle. 
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However, solutions to Eq. 4-13 and 4-14 for the WSOG distribution are very 
unwieldy (though computable) and difficult to optimize. A more useful result 
is obtained if, for each hypothesis, we approximate the WSOG function by a 
single Gaussian function. The best Gaussian function to choose depends on the 
goals of the approximation and one can, in general, conceive of optimizing 
statistical distance criteria in choosing the parameters of the Gaussian 
approximation. However, one easily computable and commonly used approximation 
is the Gaussian function which preserves the first two moments of the WSOG 
distribution. In particular, if we let my 1 and Cy 1 represent the mean and 
covariance (both functions of W) of p(v|%), we have, 

L 

mv 1 = E* {v|Hi> = l P* v*-* (4-17) 

1 


L 

Cy 1 = cov {v|Hi> = l P A { Cy 1 * + (vi* - my 1 ) (\M - m v i ) T } (4-18) 

1 


where the quantities 

yi^ = E (v|Hj,Jlth model) and 
Cy 1 ^ = cov (v|H£, £th model) 

are directly computable from Eq. 4-] 5 and 4-16 and the parity check matrix. 

In fact, it is easy to show that my 1 = W^Yp 1 and Cy 1 = W^ Cy* W where Yp 1 and 

P 

Cy 1 are the mean and autocovariance function (ACF) of the window of 
measurements and are easily computed as in [13]. 

Using the Gaussian sum approximation embodied in Eq. 4-17 and 4-18 and 
letting C 1 and cJ denote the ACF of Yp under the ith and jth hypotheses 
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respectively, we can explicitly evaluate the distance measures in Eqs. 4-13 
and 4-14 as: 


Jij = - [Yp 1 - Ypj]T w 


T 1 -1 1 i -1 

(w c w) + (w c w) 


W x [Ypi - YpJ] 


+ - tr [(W T C J W) _1 (W^W) 
2 


+ (w T C 1 wf 1 (w T C J w) -2I t ] 


(4-19) 


1 T J ^'5 , T ^ — 0 . 5 

B = _ In det [(W C w) (w C w) 

2 

t i .0.5 , t j -0.5 . 

+ (w C w) (m o w) ] 

+ “ tV ■ Vl T W + cJ 3 W] 1 W T [Ypi - Ypj] (4-20) 

The general optimization approach for both distance measures involves a 
gradient-type scheme. However, in two special cases, which are of consid- 
erable interest in the FDI problem, the optimization provides explicit solu- 
tions. 

IDENTIFICATION OF SENSOR BIAS FAILURES OF KNOWN MAGNITUDE 

In this case only b j}- in Eq. 4-16 is different for the hypotheses Hi, 
i=0,l NO. Under H 0 , b^ = 0. This implies that the ACFs C*- = cJ = C> and 
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the distance measures (Eqs. 4-19, 4-20) are scaled versions of the Mahalonbis 
distance: 

1 

Mij - " Jij = Bij 

= l ~ [Y p i - Y p j]T W(W T CW) V [Y p i - Y p j] (4-21) 

The optimal parity relation, W, which maximizes the criterion in Eq. 4-21 is 

given by 

-i _ 

W = C [Y p i - Y p j] 

and the optimal distance measure is 

1 

M i j " " J ij = ®i j 

= ^ [Y p i - Y p j]T c" 1 [Y p i - YpJ] 

The parity relation v(k) = W^Yp(k) described above (i.e., for sensor bias 

/ 

failures of known magnitude) can be thought of as an approximate whitener 

followed by a correlator, as shown in Fig. 4-1. The approximation stems from 

the Gaussian sum approximation. Finally, we note that the residual which 
results in this case is precisely the decision statistic obtained when we con- 
sider the problem of distinguishing known signals with (p+1) observations in 
zero-mean colored Gaussian noise with autocovariance function C and signal 
means Yp* and Ypj . 
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Y<k) 



Figure 4-1. Whitener-Correlator Interpretation of Detection 
Parity Relation 


IDENTIFICATION OF NOISE VARIANCE AND SCALE FACTOR FAILURES 

This case corresponds to having different covariances under , i=0, 1, - 
...NO with the additive bias terms d^(k) and b£*(k) zero for all i and £. 

As a result Y p i = 0, but C** CJ. With this simplification, optimization of 
Jij and Bjj yield identical optimal parity relations (Appendix A) which sat- 
isfy the eigenvalue equation, 



w 


X w 


(4-22) 


The eigenvectors which make J^j and B^j largest are those which correspond to 
the eigenvalues for which X + X~1 is smallest. 


DISCUSSION 

The use of statistical discrimination metrics in defining robust parity 
relations provides us with several interesting results. 

1. As discussed above and in Appendix A, the parity checks are functions 
of the statistical characterization of the measurements (and control inputs if 
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they are available). This characterization, in the form of means and auto- 
covariance functions, was computed from the uncertain system model of Eqs. 4- 
15 and 4-16. We can, however, easily of derive parity checks from experiment- 
ally determined statistical information as well. Furthermore, insights into 
the problem of adaptive FDI may be drawn by considering the operation of on- 
line estimation of the statistics of the measurements. 

2. Although we have focussed only on the problem of determining parity 
check relationships, the metric based evaluation and optimization technique 
can be applied to the information collection or decision algorithm as well. 

For example, we might define a window of residuals and maximize the distin- 
guishability of two hypotheses by choosing a linear transformation of the 
residuals over time. We could also define simpler functional relationships 
between residuals and decision statistics and optimize the parameters of the 
relationships using an appropriate metric. Section 5 treats some specific 
algorithms in this way. 

3. In addition to defining parity relations, discrimination metrics such 
as J^j and are useful in defining performance bounds for any FDI algorithm 
in £he presence of model uncertainty. As discussed in Appendix A and [13], 
both metrics can be used to place bounds on the performance of the "optimal" 
decision rule based on the observed data. This allows fundamental performance 
limits to be determined and used for comparison in the design process. Also, 
the comparison of various system "configurations" (e.g., various sensor com- 
plements including hardware redundancy or various combinations of models) can 
be accomplished without designing an entire FDI system simply by measuring the 
distingui8hability of various hypotheses with metrics that use the probability 
density function of the observed quantities. 
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4. Evaluation of alternative FDI schemes, such as Kalman Filter based 
algorithms, is possible by applying distinguishability metrics to the 
sufficient statistics of that algorithm; i.e., using the pdf of these 
statistics in the metric computations. Section 5 documents an example of the 
evaluation of Kalman filter based algorithms using metrics applied to the 
residuals. Because of the invertibility of the Kalman filter, these results 
essentially demonstrate the information content of the sensors themselves, 
given the size of assumed model error. 
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SECTION 5 


APPLICATION RESULTS 

The theoretical work accomplished in Task 1 of this project and detailed 
in Appendix A has focussed, primarily, on various approaches to the problem of 
generating robust residuals and evaluating the associated "information con- 
tent" for the failure hypothesis testing problems of interest. 

In this section, we apply these results to the development of FDI 
algorithms for sensor bias failures and to the evaluation of Kalman filter 
based FDI. 

5.1 DECISION ALGORITHMS 

Although, it is possible to use residuals directly in making decisions, 
there are practical advantages to using further processing before decisions 
about a system's failure status is made. For example, residuals can be chosen 
to represent physical structure alone by ignoring the effects of sensor noise 
in the metric based optimized parity relation computations. As a result, re- 
siduals which are valid over a wider range of operating conditions may be 
obtained. 

We have decomposed the further processing of residuals into two compon- 
ents; information collection and decision logic. The information collection 
phase is necessitated by the fact that all decisions about the system's oper- 
ational status must be made by comparing a number (or set of numbers) to a 
threshold. Thus, all of the failure information must be noninvertibly com- 
pressed, over time, into a set of decision statistics upon which the decision 
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logic will operate to determine the system's status. Furthermore, since the 
actual performance of the overall FDI algorithm depends highly on its decision 
statistics, a useful algorithm should allow analytical assessment of these 
statistics in terms of failure distinguishability . Several candidate algor- 
ithms are outlined below. 

5.1.1 Maximum Likelihood Decision Rule 

Since the information collection algorithm operates on the residuals over 
time we consider a window of residuals, Vq(k) defined by 

Vk) T = [v T (k-q), v T (k-q+l), ..., J (k)} (5-1) 

where 

T T 

v(k) = W y Y p (k) + W u Up(k) , 

and where U p and Y p are p-windows of the NC controls and NO outputs respec- 
tively and W = [Wy|w u ] is the t by (NO+NC) (p+1) parity check matrix. 

Recall now that in our representation of uncertainty, (see Section 4) it 
turned out that v(k), and hence Vq(k) could be described statistically by a 
weighted sum of Guassians (WSOG) probability density function (pdf) for each 
failure hypothesis. That is; 

, L v i i 

Pv ( C | Hi ) = I P* N v (m* ; P £ ) (5-2) 

£=1 ^ 

where N^(m,P) denotes a Gaussian density in the variable £ with mean, m, and 
covariance matrix P and p£ denotes the a-priori proability of the £.th model. 
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In a classical m-ary hypothesis testing problem, if we can compute p v (c|Hi), 

q 

then a simple maximum likelihood (ML) decision rule can be defined. That is, 
Decide H i: if Pv ( c | H ± ) > p v U|Hj) , for all j. (5-3) 

q q 

Although Eq . 5-3 represents an optimal algorithm (in the sense of minimum 
error probability for the specified pdf's), the computation of these pdf's is 
quite involved and highly dependent on the set of models which have been as- 
sumed. We can, however, simplify the required processing and possibly reduce 
the sensitivity to the specific set of models by using a Gaussian approx- 
imation to the pdf's in Eq. 5-3. One commonly used approximation is the 
Gaussian density which preserves the first two moments of the original WSOG 
pdf. The mean and covariance of this approximate Gaussian density are given 

by, 

. A i 

E(v q |H i ) m 1 = 2 P£®£ (5-4a) 

l 

. A i i i i i T 

Var (v q |H i )- C v = l p£ [Pjj, + (mjj, - m ) (m* - m ) ] (5-4b) 

q £ 


Given this approximation, the information collection phase is considerably 
simplified. Eq. 5-3 reduces to an algorithm where decisions are made on a 
linear transformation of the windowed residuals [14]. In particular, de- 
cisions are based on the log-likelihood-ratio (LLR) statistic. 


*i 


{(mi) C v -M v q (k) 


1 . T 

- - (mi) 


..-1 


(mi) 


(5-5) 
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In Eq. 5-5, note that m 1 and C v -1 are computed off-line and we have assumed 

q 

that C v * is the same for all hypotheses as in the case of sensor bias fail- 

q 

ures. Details of this calculation are now given. 

The approximate mean, m 1 , is given by. 


m 1 = 


I P = e £ (v q (k) 1% } 
£ * 


(5-6) 



where 

n = [W y T Y p ^ + Wj U p i*] , 

Yp iA = E [Yp(k) | , £th model] 

Up^^ = E [Up(k)|Hi, ^th model] and 
I t = t x t identity matrix 

The approximate covariance matrix, C v is given by 

q 

Z Z T 

c V = I P„ ( C v * + (mi - mi) (mi - mi) } 

" t * ’ 

= Toeplitz [Cy®, Cv 1 , »•«, Cy] 

q 


(5-7) 


(5-8) 


(5-9) 
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Cy3 — WyT Cyj Wy 


( 5 - 10 ) 




Ci c^+i . • . C^+p 

^i-1 ••• Ci-fp-i 


(5-11) 


Oi-p Ci_p+i ... Ci 


and where Ci = C_i^ = E (y(k) y^(k+l) } is computed as described in Appendix A 
and [13]. 

Note that, assuming a stationary input, C v ^ is a block Toeplitz matrix 

q 

(although Cy-j is not) an hence C v is also Toeplitz. This property is useful 

q 

since reliable and efficient algorithms exist for computing C v - ^. 

q 


5.1.2 An Alternative FDI Algorithm 

The approximate ML algorithm described above represents a considerable 
simplification over the optimum ML algorithm based on the WSOG density func- 
tion. In some instances, however, further simplification may be desired. In 
particular, Eq. 5-5 requires the maintenance of a possibly large window of 
residuals and the v resulting inner product with a large pre-stored sequence. 
This processing requirement stems, primarily, from the residual sequence being 
(in general) not "white". If, however, C v were block diagonal (v q (k) uncor- 

q 

rellated or "white"), then the decision statistic, is based on a simpler 
computation, viz., 


*i ~ 


_t 1 q 

V 1 Cy° £ 

L J j=o 


v(k-j) 


where = E(v(k)|Hi) and C v is the covariance of v(k). 


(5-12) 
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Although, in general, v(k) is a correlated sequence, we can define the 


decision statistic. 


*i 


gi l v(k+j) 


(5-13) 


j=0 

and choose the g^ to maximize the distinguishability of Hypothesis i from 
other failure hypotheses. That is, we will partition the failure space into 
sets of pairwise hypotheses and design a single statistic (XjJ which is useful 
for distinguishing them. For example, suppose we consider the detection 
problem in which each failure hypothesis should be as discriminable as poss- 
ible from normal operation. We can, for example, minimize B^q as discussed in 
to find the best choice of gj. For the problem of detecting sensor bias 
failures , 


Sl ~ ( 1 l tc v ]jk l' 1 ”1 (5-14) 

j * ' 

where the term in braces is the covariance matrix of the sum in Eq. 5-13, 

[C v ] -jk is the j-k'th block matrix of dimension t x t in the Toeplitz matrix 

q __ 

C v and is the expected value of v(k) under the ith failure hypothesis, 

q 

(see Eq. 5-7). 

In order to define an FDI algorithm which utilizes Xj, we need to compute 
the statistics of X^. Decision regions (in X^ - space) are then easily deter- 
mined using a Gaussian approximation. For the detection problem, we specifi- 
cally need the mean E(X^) and variance a^, under normal operation (H 0 ) and un- 

X 

der the ith failure (H^). Specifically, 


Under H 0 : E(X*) = 0 

var(X t ) = vi T { I l [C v ] } -1 \ 

■ q i k 

J k 


(5-15) 
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(Note here that further simplification of the braced term is possible using 
the Toeplitz properties of C v ). 

q 

Under : E( X^) = g^ T • • (q+1) (5-16) 

var( X^) = as in 5-15. 

A number of observations can be made concerning the Eqs . 5-13 to 5-16. 

1. First, any sensor bias failure whose magnitude is greater than that 
assumed in computing g£ will result in a larger value of E(X-[) under Hj . 

Thus, the likelihood of making a type 2 error (choosing H 0 under H^) is 
smaller for all sensor bias failures which are larger then the design value. 
This is clearly a desirable property of FDI algorithms. 

2. Distinguishability metrics can be computed for Xj, for each q, there- 
by indicating the length of the information collection window which is needed 
to achieve the desired performance. For example, the Bhattacharyya metric, 
using the approximate Gaussian densities defined by Eqs. 5-15 and 5-16 is. 

Bio = - V { l l [C v ] r 1 \ (q+1) (5-17) 

8 j k ’ Jk 

Notice that the dependence on the window size, q, is imbedded in the bracketed 
terra. 


5.1.3 Minimizing Decision Delay for Abrubt Failures 

The algorithms discussed in 5.1.1 and 5.1.2 are based on moving window 
calculations for both residual generation and information collection. 

Implicit in the analysis of these algorithms is the assumption that the effect 
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of a failure is present throughout the window. In the case of abrupt fail- 
ures, these algorithms result in a decision delay at least as long as the 
information collection window. In cases when large uncorrellated errors are 
present in the residuals (e.g., due to sensor noise) this window can be 
exceptionally large resulting in possibly large decision delays. 

When dealing with abrupt changes in systems a significant problem is the 
one of unknown onset time. Fig. 5-1 illustrates the improvement in detecta- 
bility of a constant mean in white Gaussian noise when the optimal processing 
(maximum likelihood decision statistic) is begun at the failure onset time as 
opposed to operating over a moving window. The figure suggests that for long 
windows, one would expect significant performance improvements from the algo- 
rithm which is started at the failure onset time. 

A variety of techniques have been introduced which attempt to realize 
performance that approaches the level obtained when failure onset time is 
known. Willsky's paper [2] provides a good review of the issues and reference 
[16] is an example of current work in this area. 

To avoid the computational burdens which are typical of many solutions to 
the problem of unknown onset time (e.g., GLR with implicit failure time esti- 
mation [2]), we are interested in alternative algorithms. The use of decen- 
tralized parity relations of various orders provides us with such an altern- 
ative. 

First, note that zeroth-order parity relations, that is relations based 
on memoryless comparison of the values of inputs and outputs at the present 
time only, respond instantaneously when a failure occurs. Higher-order re- 
lations provide information with some time delay. A two-level FDI structure 
which makes selective use of these relations is now described. First, at the 
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Hn : y{ k) = q(k) 
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& °n 

B = 4T m n ^ N 
o 




WINDOW LENGTH = N 


NUMBER OF SAMPLES FOLLOWING FAILURE ONSET 


n 

R-2135 


Figure 5-1. Advantage of Known Onset Time 

monitoring level we have tests based on the shortest (lowest order) and most 
sensitive of the parity relations (preferably zeroth order) so that we have 
full coverage of all failure modes. Alarms are declared at the monitoring 
level using short intervals for information accumulation and relatively low 
thresholds. This produces very fast detection at the expense of possibly 
larger false alarm rates. These alarms, however, are not used as a positive 
failure indication, but rather as a trigger for the second level. The second 
level consists of longer running tests (beginning at the initial point in the 


41 



monitoring data window if desired, so that no time delay is introduced by the 
two-step procedure) for reliably identifying the failure mode and rejecting 
any false start. These somewhat longer tests are based on the same low-order 
parity relations used at the monitoring level, as well as other higher-order 
relations . 

The second level tests (which are triggered by alarms at the monitoring 
level) provide final reliable failure decisions. They make use of residuals 
in a selective way so as to minimize the decision errors and delays of each 
test. That is, only those residuals which provide reliable information about 
the hypotheses being tested are used as inputs to these tests. One test which 
may be employed is known as a Sequential Probability Ratio Test (SPRT). This 
test has the property that it reaches a decision in as short a time as is 
possible given the level of uncertainty in the residuals and specified prob- 
abilities of correct and incorrect decision. 

Fig. 5-2 shows a functional breakdown of an FDI system based on the pro- 
posed two level structure for detecting and identifying sensor failures in the 
F-100 engine example; details of the F-100 model were given in Table 4-1. The 
trigger mechanism consists of five weighted sum of squared residuals (WSSR) 
statistics (one for each failure mode) which are used in parallel to generate 
a single trigger. Each WSSR statistic, s k , is defined by 

SR+i = a s k + (1-a) W k (5-18) 

where, 

W k = v(k)Tc v -lv(k), 

Only those residuals which contribute significantly to the distinguishability 
of a single failure from normal operation and respond quickly to failures 
(i.e., low order parity checks) are used in each trigger test. The parameter a 
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Figure 5-2. Functional Breakdovm of FDI Algorithm 

























is chosen on the basis of tradeoffs between decision delay and probability of 
false triggering. The individual thresholds for each test are chosen based on 
Chi-squared distributions and the desired probability of missing a failure in 
the trigger mechanism. 

Following a trigger, 15 SPRT tests are initiated (one for each pair of 
failure hypotheses; five sensor failures plus the no-fail hypothesis). For 

sensor bias failures, we use an SPRT statistic, s^, which is defined by the 

log-likelihood-ratio for two means in white noise and is given by, 

_ _ T 1 _ T _ T _ 

s k = sk -1 + [Vi - vj] C v "l v(k) - - [ ^ C v _1 ^ - vj C v -1 Vj] (5-19) 

Only those residuals which contribute significantly to the distinguishability 
of failure mode i from failure mode j, (note: v 0 = 0) are used in each SPRT. 

The SPRT algorithm chooses hypothesis i over hypothesis j if > t + , hyp- 
othesis j over i if s^ < t - , and continues sampling when t“< s^ < t + . A 
decision is made when all the SPRT's "vote” in favor of a single hypothesis 
over all others. When a decision event doesn't occur, we are left with an 
indication of the ambiguity set which the algorithm can not resolve using the 
current data. Finally, note that in Eq. 5-19, the size and sign of the sensor 
bias failure is assumed known. When neither size or sign is known, the SPRT's 
must be modified so that any triggerable failure is correctly isolated; e.g., 
see [20] . 

5.1.4 Simulation Results 

The algorithm of Fig. 5-2 was coded and tested on a linear simulation 
of the F-100 engine at h = 10k, M = .6, PLA = 83°. Five zeroth order robust 
detection parity checks using the "null space" approach (Section 4) were gen- 
erated based on model uncertainties corresponding to the curve fit errors [11] 
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of the reduced order linear model being used and subsets of these selected for 
each test. The "jth" WSSR and "jth" SPRT tests for distinguishing failures 
from normal behavior used only a single parity check (the one corresponding to 
sensitivity to the jth failure) while the "i-jth" SPRT tests for distinguish- 
ing between failure i and failure j used two parity checks (the ones corres- 
ponding to sensor failure i and sensor failure j sensitivity). Table 5-1 
shows the values of the coefficients which multiply scaled versions of the 
measurements and control values for each parity check. The number in the 
metric column is related to the robustness of the parity check, where zero is 
a perfect parity check and each metric must be smaller than 5.6 (largest 
eigenvalue of the "C 0 " matrix in Section 4). Notice that failures of sensor 1 
and 5 show up in parity checks one and five in a very similar manner. This 
implies a possible difficulty in distinguishing these two failures. 

Figure 5-3 shows the five residuals (scaled by 1000) obtained from the 
corresponding parity checks and noiseless measurements generated by the linear 
F-100 engine model with fuel flow as the only input (modeled as a first order 
Markov process with 0.1 sec. time constant and standard deviation of 350 PPH) . 
Notice that even without sensor noise, the residual is not identically zero. 
Variations in the state modulate the modeling errors through the imperfect 
parity checks. A 50 rpm bias is abruptly introduced at t = 1.0 sec. and shows 
up, as expected, primarily in residual numbers one and five, (see Table 5-1). 
Figure 5-4 shows the five WSSR statistics with a = .37 for each test. 

Clearly both WSSR statistics one and five can trigger SPRT tests when a 
failure of sensor 1 occurs. Fig. 5-5 shows the SPRT tests designed to distin- 
guish each failure from no failure (s^o, 5). The WSSR thresholds 

were chosen so that a false trigger was initiated at .42 seconds. 
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TABLE 5-1 

ZERO th ORDER ROBUST DETECTION PARITY CHECKS 
(Null Space Approach) 


Sensor 




Scaled Sensor Coefficients 



Scaled 

Control Coefficients 


Sensitivity 

Constraint 

P 

Metric 

Nl 

N2 

PT4 

PT6 

FT IT 

UP 

AJ 

FGV 

SVA 

BLC 

l 

0 

4.487E-02 

0.6828 

0.2472 

0.0656 

-0.0023 

0.4456 

-0.4577 

-0.2341 

0.0080 

0.0013 

-0.0678 

2 

0 

2.231E-02 

0.1357 

0.6185 

-0.5738 

0.0063 

0.1052 

0.0247 

-0.2999 

0.0168 

0.0028 

-0.4097 

3 

0 

2.266E-02 

0.0350 

-0.5573 

0.6469 

0.0023 

0.0194 

-0.1659 

0.2625 

-0.0164 

-0.0027 

0.4154 

4 

0 

8.543E-01 

-0.0508 

0.2525 

0.0944 

0.5901 

0.0682 

-0.1552 

0.3104 

-0.0268 

-0.0002 

-0.6714 

5 

0 

4. 391E-02 

0.6411 

0.2752 

0.0523 

0.0044 

0.4653 

-0.4741 

-0.2471 

0.0085 

0.0014 

-0.0895 


Scale Factors Definitions 

T T 

v = W v + W u 
y y U u 

y = S • Y 

y 

u = S • U 
u 

S y = diag [lE4, 1.5E4, 550, 130, 1600 ] 
S u = diag [1.5E4, 5, 50, 10, l] 


Following the false trigger, all SPRTS in Fig. 5-5 indicate that no failure 
has occurred by t = 0.70 seconds by crossing the threshold t” = -9.2, and are 
therefore turned off. The thresholds (t + = 9.2 and t~= -9.2) correspond to 
equal probabilities of type 1 and type 2 decision error and are determined 
from equations found in [2] and [6]. Finally, Fig. 5-6 shows the five SPRT 
tests for distinguishing sensor number 1 failures from all other hypotheses. 
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As expected the most difficult decision (and hence longest SPRT decision 
delay) is between sensor //I and #5. However a correct isolation of sensor 
failure itl is obtained at t = 1.24 seconds when s^g > t + , and S 21 , S 31 , S 41 , 
s 5 i < t“. 

In summary, we have presented three potential decision algorithms for 
detecting and isolating sensor bias failures in the F-100 jet engines at a 
single operating point. One of these was chosen and implemented in Fortran 
code (see Appendix C) and demonstrated with a linear simulation of the F-100 
engine. 

In the next subsection, we apply the theoretical results of Section 4 to 
the evaluation of Kalman filter based algorithms. 

5.2 KALMAN FILTER EVALUATION 

As discussed in Section 4 and Appendix A, the metrics we have proposed 
for use in selecting parity checks can also serve as an evaluation tool for 
any FDI system. Task 3 of this project consists of evaluating FDI algorithms 
based on Kalman filter residuals and in particular, the algorithm discussed in 
reference [3]. 

The algorithm reported in [3] consists of a single Kalman filter (KF) 
which uses all five available measurements to estimate the four state vari- 
ables (see Table 4-1 for model details) and produce a vector of five re- 
siduals. These residuals are then used to form a weighted sum of squared 
residuals (WSSR) detection statistic. The WSSR is summed over a finite 
window, and compared to a single threshold for detecting the presence of a 
sensor failure. Isolation of the failed sensor is accomplished through the 
use of five KF's (KF^ , . . . ,KF 5 ) , each of which makes use of only four of the 
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available measurements. The residuals from the five isolation filters are 
then used to form 5 WSSR statistics which are then low-pass-filtered (a re- 
cursive, exponential age weighting of the WSSR statistic) and compared to the 
statistic generated by a KF which uses all five measurements, (KFq). Ideally 
when sensor i fails, the statistics produced by KFq and KFj, j*i, will "di- 
verge" due to the presence of a failed sensor in each of these filters. The 
statistic produced by KFj[ will remain comparatively small. 

A number of error sources are possible contributors to poor performance 
of this algorithm. First, in the detection phase, modeling error produces 
non-ideal residuals, (namely nonwhite), thus causing a decrease in detection 
performance from that which could have been predicted on the basis of the 
chosen WSSR weights and thresholds. The exact nature of how modeling error 
can effect performance depends on the choice of Kalman Gain as well. Thus, 
our metric based approach to assessing the impact of modeling error on KF- 
based FDI algorithms depends highly on the choice of Kalman Gain. 

To demonstrate the use of metrics i. /aluating Kalman Filter based algo 
rithms, we designed a Kalman Gain using "LQGALPHA,” a software package devel- 
oped by ALPHATECH which includes the capability of computing steady state 
Kalman Gains from specified process and sensor noise covariances matrices. 

The standard deviations for the corresponding sensor noises were as 
follows : 

on! =13.9 RPM 
on2 =12.1 RPM 
optr = 0.67 PSI 
OpT5 = .09 PSI 
OpiIT = 2.2° F 
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A fuel flow variance of 350 PPH, (all other control inputs assumed zero) 
was used. These noise variances were then scaled in accordance with the 
scaling (both output scaling and "correction factors”) applied to the nominal 
model. The resulting gain (discrete time KF formulation) is shown in Table 
5-2. The equivalent continuous time filter has a transfer function (states 
to state estimates) with eigenvalues of about -40, -6, -2, -.8 (rad/s.). 

Note that the mode corresponding to the smallest eigenvalue (-40) has a time 
constant on the order of the sampling time (T = .02 sec) indicating that cer- 
tain linear combinations of the measurements are used directly at each time 
step with little or no filtering. 


TABLE 5-2. KALMAN GAIN AND EQUIVALENT CONTINUOUS TIME TRANSFER 
FUNCTION (STATES TO ESTIMATE) AT "DC” 


Discrete Time Steady State Kalman Gain 


1.4870000E-01 
7 . 2420000E-02 
4.8110000E-03 
5.8830000E-03 


2 . 1830000E-01 
1.0640000E-01 
7.0520000E-03 
8.6410000E-03 


7.4580000E-02 

3.6350000E-02 

2.4070000E-03 

2.9500000E-03 


1.4280000E-01 

6.9550000E-02 

4.6140000E-03 

5.6430000E-03 


-2.71 60000E-01 
-1.3230000E-01 
-8.7930000E-03 
-1 . 07 50000E-02 


Figure 5-7 shows the five KF residuals of the nominal KF (Table 5-2) with 
an Nl-bias failure of 50 RPM occurring at t = 1.0 seconds. The truth model 
used to generate the residuals in Fig. 5-7 was chosen from the set of randomly 
perturbed system matrices used in the robust parity check design process. 
Software for implementing this process is described in Appendix E. The system 
matrices were generated by adding zero-mean Gaussian errors to the nominal 
system matrices of Table 4-1. The variances of the individual terms were 
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obtained from the reduced order model curve fit errors (” 0 y/ x ") reported in 
[11] (and shown in Table 4-1), and checks for outliers are performed before 
using any result. No sensor noise is simulated since the realistic values of 
noise variances tend to obscure the robustness results. A 1st order Markov 
model for fuel flow is used as the input with a correlation time of 0.1 
seconds and a steady state standard deviation corresponding to 350 PPH. This 
input is intended to model the correlation structure of the input generated by 
the control system which regulates the engine at this operating point. 

With no sensor noise and measurable control inputs, an ideal filter would 
have residuals which are identically zero during no failure and a bias in each 
residual following the failure. Thus the effects of model mismatch are easily 
derived from Fig. 5-7. The expected residual bias (scaled by 1000 as in the 
figure) due to an Nl-failure can be calculated using the KF evaluation results 
discussed subsequently and in Appendix A. In particular, E{\^p} =» [4.1, -0.4, 
-0.3, -0.2, 1.5]T. As is evident in the figure, deterioration of the KF resi- 
duals due to modeling error is not particularly severe in terms of detecting 
an Nl-sensor failure if one allows for the non-ideal behavior in setting 
thresholds. 

5.2.1 Evaluation of KF Using the Bhattacharyya Distance 

Characterization of KF performance cannot, of course, be realistically 
inferred from a single simulation. The use of performance metrics, however, 
can provide a realistic characterization by assessing the ensemble- 
averaged impact of model errors in terms of FDI performance. First, however, 
we must extend the results of Appendix A to include systems with known control 
Inputs. Consider a Kalman Filter which is based on the nominal system. 
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x(k+l) = A o x(k) + B 0 u(k) 


(5-20) 


y(k) - C Q x(k) + D 0 u(k) (5-21) 

The mean and covariance of the residuals for the nominal KF under the ith 
failure and the £th model hypothesis (A£,B£,C£,D£) are computed, from the 
dynamic equations , 


x a( k+ l) = f £ x a (k) + G £ u(k) + K (v k + b*) 


v i^(k) = H£ x a (k) + M £ u(k) + Vfc + b l 

where, we have. 


F* 


A £ 0 

AqKqC £ Aq[T--ICoCo] 


(5-22) 

(5-23) 


(5-24) 


G£ 


® £ 

B 0 ■*" AqKq [ D £-D q ] 


(5-25) 


K 


0 


(5-26) 


H£ - [ C£ | -C 0 ] 


(5-27) 


M £ — D £ — D 0 


(5-28) 
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Thus, under the ith failure and ith model we have, 

v 1 * A E( vlHj.i) = H£ x i£ '+ u + bi (5-29) 

a 

where 

xi £ = [ I - F £ ]"1 { G £ u + K b t } (5-30) 

and 

C l A cov( v| H ± , V) = Fo C A F t + G £ Q G t + K R KT (5-31) 

a ~ a i i 

where 

R = cov(v^) 

Q = cov(uk) 

Note that, in Eqs. 5-20 and 5-21, the measurable input sequence is assumed to 
be a white noise process. Other statistical characterizations of the input 
sequence can be included by adding additional states to Eq. 5-22. Highly 
correlated input sequences tend to be the "worst case" for evaluation purposes 
since the modeling error (which is modulated by the state) would then tend to 
produce highly correlated residuals and be most confused with sensor biases. 

These results were applied to the KF described by the Kalman gain in 
Table 5-2 in order to assess the mismatched-model-ensemble-average FDI per- 
formance. Table 5-3 shows the pairwise B-distances between sensor bias 

failure modes (positive and negative failures of sensors; i = 1 5, and no 

failure; i = 0), based on a Gaussian approximation to the "weighted sum of 
(10) Gaussians" probability density function which is implied by our multiple 
truth model assumptions and the input and sensor noise covariances used to 
generate the gain in Table 5-2. In the table, larger distances imply lower 
achievable misclassif ication rates when the assumed Gaussian statistics are 
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TABLE 5-3. BHATTACHARYYA DISTANCES FOR SENSOR BIAS FAILURES 
KALMAN FILTER RESIDUALS 


SENSOR 

FAILURE 

0 

1 

2 

3 

4 

N1 

1 + 

1.111E+00 






- 

1.111E+O0 





N2 

2 + 

1.562E+00 

3 . 321E+00 





- 

1.562E-KD0 

2.024E+00 




PT4 

3 + 

2. 146E+00 

3.581E+00 

4.238E+00 




- 

2.146E+00 

2.933E+00 

3 . 179E+00 



PT6 

4 + 

9.798E+00 

1.109E+01 

1.165E+01 

1 . 165E+01 



- 

9.798E+00 

1.073E+01 

1.107E+01 

1.224E+01 


FTIT 

5 + 

4.600E-01 

6. 593E-01 

1 . 258E+00 

2 . 170E+00 

9.861E+00 


- 

4.600E-01 

2.482E+00 

2 . 786E+00 

3.042E+00 

1.065E+01 


used. Note that the B-distance for detecting FTIT sensor bias failures is 
quite small indicating that the assumed failure size of 8°F may not be easily 
detectable. Other failure sizes assumed in Table 5-3 correspond to 50 RPM for 
N1 and N2 sensors and 3 PSI for PT4 and PT6. In addition, the B-distance be- 
tween positive bias failures of N1 and FTIT is also small enough to indicate 
potential misclassif ication problems. 

The separate effects of sensor noise and modeling error can easily be 
assessed by setting the sensor noise equal to zero and computing the pair-r 
wise B-distances. These distances are given in Table 5-4. All distances are 
larger than in Table 5-3 as expected although the relative ordering changes 
somewhat . 

Finally, Table 5-5 shows distances for a perfectly known model with non- 
zero sensor noise. Comparing Table 5-5 with Tables 5-3 and 5-4 we can con- 
conclude that deterioration of the filter (smaller distances) due to model- 
ing error is not particularly severe in terms of potential FDI performance. 
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TABLE 5-4. BHATTACHARYY A DISTANCES FOR SENSOR BIAS FAILURES 
KF WITH NO SENSOR NOISE 


SENSOR FAILURE 

0 

1 

2 

3 

4 

1 

+ 

1.680E+01 






- 

1.680E+OL 





2 

+ 

2.197E+01 

5.801E+01 





- 

2. 197E+01 

1.953E+01 




3 

+ 

2.591E+01 

4.274E+01 

5.754E+01 




- 

2.591E+01 

4.269E+01 

3.824E+01 



4 

+ 

1.396E+01 

2.545E+01 

4.264E+01 

3.927E+01 



- 

1.396E+01 

3.608E+01 

2 . 923E+01 

4.049E+01 


5 

+ 

4.275E+00 

1.03IE+01 

2. 200E+01 

2.401E+01 

1.708E+01 


- 

4.275E+00 

3.185E+01 

3.050E+01 

3. 636E+01 

1.940E+01 


TABLE 5-5. 

BHATTACHARYYA DISTANCES FOR 
KF WITH NO MODELING ERRORS 

SENSOR BIAS FAILURES 

SENSOR FAILURE 

0 I 2 

3 4 


1 

+ 

1.294E+00 






- 

1.294E+00 





2 

+ 

1.864E+00 

3.731+00 





- 

1.864E+00 

2.583E+00 




3 

+ 

2.395E+00 

4.006E+00 

4 . 564E+00 




- 

2.395E+00 

3.372E+00 

3 . 955E+00 



4 

+ 

1.313E+02 

1 . 352E+02 

1 . 357E+02 

1.351E+02 



- 

1.313E+02 

1.300E+02 

1 . 307E+02 

1 . 324E+02 


5 

+ 

6. 828E-01 

8.611E-01 

1.480E+00 

2.490E+00 

1.272E+02 


“ 

6.828E-01 

3.092E+00 

3.614E+00 

3.667E+00 

1.368E+02 
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However, it should be noted that the major effect of modeling errors (Table 
5-5 vs. 5-3) appears to be in the distances involving sensor number four 
(PT6), suggesting possible filter error and FDI problems for filters which 
rely heavily on PT6 for updating state estimates. 


5.2.2 Extension of the Evaluation Technique for Unknown Base Points 

The results above suggest that error sources other than parametric 
modeling error may be the primary cause of poor performance of the algorithm 
in [3]. One source which has been neglected until now is the uncertainty in 
the linearization or base-point of the linearized system model. To include 
this error source in our analysis, we proceed as follows. 

6x(k+l) = A 0 <5x(k) + B 0 6u(k) (5-32) 

6y(k) - C Q 6x(k) + D 0 6u(k) (5-33) 

where <5x, 6y, and 6u represent deviations from a nominal set of basepoints, 

X B ° « Y B ° and U B ° . The statistics of the KF residuals under the £th model 
hypothesis (A£, B£, C£, D£, X B ^» y b^> U b ^) and the ith bias failure hypothesis 
(y = C£x + D^u + b^) can then be computed from the dynamic equations; 

x a (k+l) = F £ x a( k ) + G£ Sufc + K(v] t +bi) + b a (5-34) 

v(k) = H £ x a (k) + M £ 6u(k) + v^ + bi + b v (5-35) 


where F£, G£, H£, D£, and K are the same as in Eq. C-5 through C-9, and; 

r I 1 ^*] [*b* - V] - B£ [Ub £ - u B °] ~1 


b a 


AoK 0 [(Y B £ - y b°) - C£ (X B 4 - X B °) - D£ (U B * - U b O)] 


bv = (Y B * - y B°) - C £ (X B * - X B °) - D £ (U B * - U B °) 
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The nominal base points for the F-100 engine model (Table 4-1) correspond 
to the unsealed values shown in Table 5-6. Clearly, comparing the bias sizes 
which we are evaluating (see Table 5-7) with Y B °, even a 1% error in base 
point determination will result in severe degradation of FDI performance. 

Table 5-7 shows the degradation in terms of the Bhattacharyya distance metric 
for a 1% error perturbation of all base points. Comparing Tables 5-7 and 5-3, 
we see that base-point uncertainty can result in up to a 3-order-of-magnitude 
decrease in pairwise B-distances. Not surprisingly, the B-distances assoc- 
iated with sensor 4 (PT6) are affected least since the nominal base point (and 
hence the 1% perturbations) for this sensor is smallest (in comparison to its 
corresponding failure magnitude) . Thus a preliminary conclusion that can be 
drawn regarding the use of this KF for FDI is that base point accuracy plays a 
major role in defining the limits of FDI performance. 


TABLE 5-6. BASE POINT VALUES AND SENSOR FAILURES 


X B ° = [8900 rpm, 11,700 rpm, 186° F, 139<>F] 

U B ° = [8000 PPH, 3 ft. 2 , -25 deg., 3 deg., 0%] 

Y b ° = [8900 rpm, 11,700 rpm, 284 PSI, 42.5 PSI, 1343° F] 


SENSOR 

Bias [50 rpm, 50 rpm, 3 PSI, 3 PSI, 8°F] 

Failure 

Magnitudes 

1 % errors [89rpm, 117rpm, 28.4PSI, 4.25PSI, 13.4°F] 

in Y b ° 
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TABLE 5-7. BHATT ACHARYY A DISTANCES FOR SENSOR BIAS FAILURES 
KF WITH 1 % UNCERTAIN BASE POINTS 


SENSOR FAILURE 

0 

1 

2 

3 

4 

1 

+ 

2. 149E-02 






- 

6.902E-02 





2 

+ 

6.291E-02 

6. 557E-02 





- 

1.659E-01 

2.193E-01 




3 

+ 

1.358E-01 

1.781E-01 

2. 114E-01 




- 

9.107E-02 

1.050E-01 

2.099E-01 



4 

+ 

2.799E+00 

2.740E+00 

2.867E+00 

3.652E+00 



- 

2.812E+00 

2.928E+00 

2.939E+00 

2 . 152E+00 


5 

+ 

3.063E-02 

2.043E-02 

3.675E-02 

1 . 548E-01 

2.778E+00 


' - 

8.625E-02 

1 . 526E-01 

2.745E-01 

1.546E-01 

2.91 5E+00 


5.2.3 Results for KF Improvement 

FDI schemes which rely on residuals from a Kalman Filter are fundamen- 
tally "centralized" approaches. That is, a large dynamic system model is 
developed and all information relating to that model is used without regard to 
the quality of the corresponding model. Efforts to tune a Kalman filter based 
on known uncertainties can be effective, but these efforts are largely heur- 
istic in nature and no well-defined and quantifiable methodology exists. The 
generation of residuals through robust parity checks is fundamentally a "de- 
centralized" approach to FDI. That is, only the best relationships between 
measured variables are selected and used individually in the FDI process. 

Thus, one might expect that improvements to the Kalman filter based approach 
might be achieved if some sort of partial decentralization could be performed. 
Figure 5-8 gives one example of a method for partial decentralization. In 
Fig. 5-8a, the measurements, y(k), are decomposed into two sub-vectors yi(k) 
and y 2 (k). When the model for y 2 is not well known for example, severe 
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estimation errors may occur in the nominal filter as the "good” information 
(y^) is mixed with the "bad", (y2) through the nominal gain matrix Kq. In 
Fig. 5-8b filtered state estimate are formed using only yj(k). The resulting 
estimates, (which are not corrupted by the inaccurately modeled measurements 
y 2 (k) ), could then be used to generate residuals corresponding to the measure- 
ments y 2 (k). Of course, if one uses the ill-modeled description of y 2 (k) 
(namely C 2 ) the residuals, V 2 , will contain errors. Alternatively, one might 
consider using y 2 (k) and yi(k) in a robust parity relation to form additional 
residuals or estimates of y 2 . 

In order to determine the potential for partial decentralization for the 
F-100 engine model we considered the use of metrics for FDI in determining 
the quality of various parts of the overall model. In particular, we con- 
sidered five multi-input-single-output (MISO) systems corresponding to each 
of the five measurable quantities (Nl, N2, PT4, PT6, FTIT). We then gener- 
high order (p = 5) robust parity checks for each of these systems using the 
robust-redundancy null-space approach (see Appendix E) . This approach, effec- 
tively, gives the best Auto-Regressive-Moving-Average (ARMA) model for the 
single output system from the set of randomly perturbed models we have defined 
In addition, the approach provides a metric for comparing the quality of these 
ARMA models (the smallest eigenvalue of the symmetrized partial observability 
matrix). The results are shown in Table 5-8. Comparisons of the metrics 
corresponding to the best ARMA model (parity check) for each MISO system sug- 
gests that the models for PT4 and FTIT are particularly poor in comparison to 
the other sensors. Thus, it is possible that improved filter performance may 
be achieved by removing PT4 and FTIT sensors from the filter. Of course, 
estimates of PT4 and FTIT may be needed for the overall control law, in which 
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case the robust ARMA models of Table 5-8 or those generated through other 
optimization strategies (see Appendix B) may be used. 



TABLE 5-8. 

EIGENVALUES AND ARMA 

COEFFICIENTS 

(P = 5) 

FOR MIS0 

SYSTEMS 

SENSOR 

METRIC 

SCALED AR 
COEFFICIENTS 


SCALED 

MA COEFFICIENTS 



3.7 10E-05 

-0.1137 

-0.0057 

-0.0081 

0.0020 

-0.0004 

0.0045 



0.3850 

0.0130 

0.0184 

-0.0046 

0.0009 

-0.0103 

N1 


-0.5628 

-0.0139 

-0.0196 

0.0049 

-0.0010 

0.0110 



0.5669 

0.0132 

0.0185 

-0.0046 

0.0009 

-0.0104 



-0.4210 

-0.0067 

-0.0093 

0.0023 

-0.0004 

0.0052 



0.1457 

0.0000 

0.0000 

0.0000 

0.0000 

0.0000 


4 . 770E-06 

-0.1192 

-0.0028 

-0.0019 

0.0001 

0.0003 

0.0015 



0.3921 

0.0062 

0.0042 

-0.0003 

-0.0006 

-0.0035 

N2 


-0.5645 

-0.0066 

-0.0045 

0.0003 

0.0007 

0.0037 



0.5671 

0.0063 

0.0043 

-0.0003 

-0.0006 

-0.0035 



-0.4148 

-0.0031 

-0.0021 

0.0002 

0.0003 

0.0017 



0.1393 

0.0000 

0.0000 

0.0000 

0.0000 

0.0000 


1 .050E-02 

0.0896 

-0.0190 

0.0394 

-0.0026 

-0.0005 

0.0671 



-0.2932 

0.0640 

-0.1278 

0.0084 

0.0015 

-0.2202 

PT4 


0.4196 

-0.0929 

0.1821 

-0.0120 

-0.0019 

0.3158 



-0.4213 

0.0935 

-0.1827 

0.0120 

0.0019 

-0.3172 



0.3093 

-0.0695 

0.1336 

-0.0088 

-0.0013 

0.2333 



-0.1041 

0.0240 

-0.0446 

0.0030 

0.0004 

-0.0788 


8.881E-04 

0.0734 

-0.0082 

0.0403 

-0.0034 

0.0001 

-0.0889 



-0.2377 

0.0267 

-0.1296 

0.0106 

-0.0001 

0.2874 

PT6 


0.3390 

-0.0382 

0.1843 

-0.0149 

0.0002 

-0.4096 



-0.3402 

0.0383 

-0.1848 

0.0149 

-0.0002 

0.4110 



0.2383 

-0.0280 

0.1345 

-0.0107 

0.0001 

-0.2998 



-0.0829 

0.0094 

-0.0446 

0.0035 

0.0000 

0.1000 


2.261E-02 

-0.0729 

0.0755 

0.0511 

-0.0035 

-0.0001 

0.0170 



0.2460 

-0.2478 

-0.1635 

0.0097 

0.0006 

-0.0628 

FTIT 


-0.3587 

0.3557 

0.2311 

-0.0124 

-0.00 11 

0.0958 



0.3611 

-0.3572 

-0.2316 

0.0122 

0.0012 

-0 .0971 



-0.2682 

0.2619 

0.1675 

-0.0080 

-0.0010 

0.0747 



0.0927 

-0.0880 

-0.0547 

0.0020 

0.0004 

-0.0277 
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5.2.4 Evaluation of the WSSR Statistic 


In order to fully characterize the performance of any FDI algorithm, one 
clearly needs to consider a statistical characterization of the decision 
quantities used in the decision process. The algorithm of ref. [3] makes 
extensive use of the weighted sum of squared residuals (WSSR) statistic which 
takes the form, 

N-l 

d K F = I ^(k+J) W-l v(k+j) (5-36) 

j=0 

To evaluate the WSSR statistic we note that for statistics with greater 
than 15 degrees of freedom, the actual probability density function (which is 
a chi-squared function) is well characterized by its first two moments and, 
hence, can be considered approximately Gaussian. In the algorithm of ref [3] 
we have 5 residuals times 3 samples (15) degrees of freedom. The mean and 
covariance of dj^p (which involve 1st, 2nd, and 4th moments of v(k)) are given 
by: 

E [ d KF ] = N • Tr[W-! C v °] (5-37) 

where C v ° is the steady state covariance matrix of the approximate Gaussian 
density for the residual sequence, and, 

N-l N-l 

Var [d^] = I l { 2*Tr[W _L cJ _k W -1 C k_ j] 

k=0 j=0 (5-38) 

+ Tr [W _1 Cj _k ] • Tr [W _1 C k "j] - Tr 2 [W" L C°] } 

where C* = E[v(k) v^(k+i)] and is computed from Eq. 5-31. 
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Using the statistics of each WSSR quantity the results of Section 4 and 
Appendix A could be applied. Various metrics which indicate the distinguish- 
ability of the various failures modes could be computed and compared to 
similar metrics for other FDI algorithms. 

In summary, the KF we have designed (and potentially that of ref. [3]) 
appears to be fairly robust to the relatively substantial model errors defined 
in [11]. If these were the only errors present, acceptable FDI performance 
would be expected. However, the full envelope algorithm must be able to 
predict base points as well as the system matrices which govern behavior of 
the engine near these points. We have shown that the ability to detect sensor 
failures is highly sensitive to base point uncertainty. Finally, we have 
developed equations for the mean and covariance of a WSSR statistic in the 
presence of model uncertainty and noise. These quantities can then be used to 
evaluate the FDI algorithm in more detail by computation of the distance 
metrics of Section 4. 
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SECTION 6 


CONTRIBUTIONS AND RECOMMENDATIONS 

The major contributions of this work are listed below. 

1. A statistical characterization of distinguishability of failures has 
allowed us to evaluate FDI performance in the presence of model uncertainty 
and noise. Furthermore, optimization of distinguishability metrics allow us 
to choose the best linear relations among measurable variables for use in FDI. 
These relationships (called parity checks) are used to form residual signals 
whose behavior under different failure modes is maximally distinguishable. 
Unlike residuals formed in a centralized manner (e.g., using a Kalman filter 
based on a nominal model), the optimized parity check approach can select only 
parts of the system model to form residuals and thus represents a decentral- 
ized approach to FDI. Only the most well known parts of the system model are 
used for defining residuals. Trade-offs can then be made which incorporate 
notions of model uncertainty and failure distinguishability. 

2. An FDI design methodology which utilizes our theoretical results has 
been defined. Rather than defining a "canned" design rule, which tends to 
obscure important design considerations, we have formulated the goals and 
design choices which the FDI designer must make and have identified tools 
which may be used in the design process. 

3. The evaluation of system design issues (such as sensor configuration 
and model accuracy) in terms of FDI performance can be made without complete 
design of an FDI algorithm. The use of distinguishability metrics applied to 
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the measurements themselves provide lower bounds on the error probabilities of 
any FDI system which has access to the same measurements. 

4. The evaluation of alternative FDI schemes is made possible by 
applying distinguishability metrics to the sufficient statistics of the 
algorithm. In the case of [11], this involves computing the distribution of 
weighted sums of squares of Kalman filter residuals. We have shown that 
performance is, in general, highly dependent on the choice of weights as well 
as Kalman Gain, and for [11] in particular, the uncertainty in linearization 
point plays a major role. 

5. Application of our design techniques to a reduced order F-100 engine 
model at a single operating point has demonstrated the feasibility and 
usefulness of this approach. However, the highly nonlinear nature of the jet 
engine (both during transients and during operating point changes) produces a 
new issue; that of robust-adaptive FDI. Here, "adaptive" refers to the 
scheduling of residual generation and decisionmaking parameters with different 
operating conditions and during transients. The metrics developed in this 
project provide a useful starting point for work in this area since they allow 
quantitative assessment in determining the boundaries between adaptivity and 
robustness. That is, we can begin to answer questions such as, how good is a 
given linear model over a particular "parameter scheduling region", when must 
we change to another model, and when must nonlinear models be considered? 

6.1 RECOMMENDATIONS FOR FURTHER WORK 

A number of theoretical and applications oriented extensions are 
suggested here for future work. 

1. First, the application and extension of our results to a highly 
nonlinear system such as the jet engine can serve to both demonstrate the 
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usefulness of these concepts and extend the theory and methodologies in 
directions which address the application specific issues in detail. For the 
jet engine problem, we believe that the issue of adaptivity versus robustness 
is of primary importance. Clearly some amount of adaptation is desirable 
since large differences in dynamic relationships exist between engines as well 
as in a single engine over its operational lifetime. A "closed loop" adaptive 
technique which operates on the same time scale as the FDI algorithm is not 
sufficient since adaptation to failures is possible. Parameter scheduling 
techniques should be used to transition between sets of reduced order models 
if one can determine the important parameters on which to schedule, and when 
scheduling is necessary. 

In this regard, the results of this project can be applied in several 
interesting ways. First, in determining the validity regions of reduced order 
models one may consider forming sets of models through an identification 
algorithm operating over increasing regions in some space of parameters. As 
this region grows one would expect (particularly in the case of parameters 
being control variables and reduced order models being linear approximations 
to a nonlinear system) that modeling error would also grow. Quantifying the 
modeling error is, then, all which is needed for evaluating each model in 
terms of FDI performance. The largest region in parameter space which 
maintains a desired level of FDI performance can then be selected and the 
validity region for the corresponding model determined. 

Alternatively one could define metrics which measure distances between 
the reduced order models themselves, or between reduced order models and a 
truth model. As in our results, these metrics would include the effects of 
model uncertainty in the distance definition. Clearly, a robust model is one 


70 



which is close (in terms of a metric to be defined) to the truth model even 
for large changes in the parameters of the reduced model. Finally, when 
linear models are used, uncertainty in the linearization point must be con- 
sidered as a substantial source of error. That is, for stable systems, the 
error in predicting the steady state impulse response must be determined and 
included in the relevant metrics for determining model robustness. 

Finally, a functional breakdown of a full envelope algorithm is presented 
in Fig. 6-1. 

2. A second useful application to which our results may be readily 
applied is in the area of system design and analysis. The dependency of 
over-all system reliability, life cycle costs, and "maintenance down-time" on 
performnce (reliability) of the FDI algorithm is well known. Tradeoffs 
between sensor cost, weight, duplication and FDI performance can now be made 
without deriving an FDI algorithm for each option, and the impact on overall 
system issues determined. Ultimately, one might conceive of an "expert" FDI 
design aid which can answer relatively high level design and maintenance 
questions by accessing the detailed analysis provided by this and other 
efforts. 

3. Within the FDI algorithm itself, some important issues in robust 
decisionmaking (i.e., how to design and evaluate the elements of the decision 
process) can be addressed with the results of this project. In several 
application areas (such as flight control) the modeling of dynamic 
relationships is in a fairly mature state. Thus, in the fundamental FDI 
decomposition (Fig. 2-1) the design of residual generation or parity relations 
is straight-forward (and, note, in many cases, nonlinear). In such systems, 
optimized parity checks may be of less interest. However, we can achieve 
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robustness in the decisionmaking process (as discussed in Section 3) by the 
selective use of the known redundancy relationships. The metrics we have used 
could be applied to solve this "selection" problem. 

In the case when knowledge about modeling uncertainty is available but 
varies over the same time scale as the decision algorithm, metrics provide a 
starting point for determining parameters of decision algorithms which might 
"adapt" to changing distinguishability conditions. Ultimately, in the case of 
changing distinguishability conditions one would like to develop a sequential 
testing procedure which, in effect, waits for favorable conditions before 
decisions are made. The relevant parameters of such a test would then be 
optimized at each time step. 

Finally, in many cases, a decision algorithm is useful only if its per- 
formance is guaranteed for a large class of failures (e.g., all bias failures 
larger than a minimal failure magnitude). In some instances (e.g., tests for 
bias failures vs. no bias), this property is a direct result of the direct 
application of simple decision rules. However, it Is not guaranteed in gener- 
al (e.g., tests which must distinguish between two bias failure vectors, each 
of which has a minimal failure magnitude). Preliminary results in this regard 
suggest, roughly' speaking, that a transformation of residuals which tends to 
orthogonalize the failure directions will allow decision rules to be designed 
for "minimal" failures with performance guaranteed for all failures which are 
"larger" than minimal. 

4. Finally, we note than in Appendix A many connections to other 
technical areas (time series analysis, pattern recognition, reduced order 
modeling) were made. A complete clarification or unification of these areas 
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in terms of the goals of FDI algorithms is considered useful in obtaining 
better insight, and in deriving alternative concepts which may be useful for 
further research into robust FDI. 
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ABSTRACT 


A decentralized failure detection and isolation (FDI) methodology, which 
is robust with respect to model uncertainties and noise, is presented. Redun- 
dancy metrics are developed, and optimization problems are posed for the 
choices of robust parity relations. Closed-form solutions for some special 
failure cases are given. Connections are drawn with other disciplines, and 
the use of the metrics to evaluate alternative FDI schemes is discussed. 
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A-l. INTRODUCTION 


A- 1.1 BACKGROUND AND MOTIVATION 

Sensor failure detection and isolation (FDI) deals with the problem of 
detecting deviations from normal behavior (which we will call "failure modes”) 
in a specified sensor complement and isolating the particular instrument that 
has failed. In recent years, numerous approaches have been developed to per- 
form FDI in linear, dynamic and stochastic systems. These methods include the 
voting schemes [1],[4], the generalized likelihood ratio (GLR) technique [3], 
[4], the multiple model (MM) approach [5], [6] and the detection filter scheme 
[7], [8], [9]. Willsky [10] has provided a comprehensive survey of the various 
FDI methods and a discussion of the strengths and weaknesses of each method. 

In general, all FDI methods use, either implicitly or explicitly, redun- 
dancy relations among the measured system variables to generate residuals (or 
generalized parity checks) and make failure decisions. In addition, all FDI 
methods can be conceptualized as consisting of three separate, clearly identi- 
fiable stages: 1) residual (parity) generation, 2) information collection and 

3) decisionmaking (see Fig. A-l-1). Essentially, the parity checks are a set 
of signals that should be near zero when the system is operating normally, and 
that will deviate from zero in characteristic ways when particular failures 
occur. The residual (parity) generation process can be of varying complexity 
for different FDI systems. For example, in a voting scheme parity checks are 
simply the differences of outputs from identical sensors, whereas in the GLR 
system, the parity checks are generated by the more complex Kalman filter (KF). 
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Figure A-l-1. Three Stage Structure of the FDI Process. 


In the information collection phase, the parity checks are accumulated 
over time (typically over a moving time window) for the purpose of establish- 
ing the presence of failures. This phase may produce one or several scalar 
decision statistics, which are then compared with a threshold in the decision 
phase to make failure decisions. The scalar statistics in most FDI systems 
are probabilities or likelihood ratios for distinguishing among a set of hy- 
potheses on residual behavior (where each such hypothesis corresponds to a 
particular failure mode). These statistics are typically generated in one of 
two ways ... via coherent or incoherent processing ... as shown in Fig. A-l-2. 
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Figure A-l-2. Information Collection Methods 
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In the coherent processing, one correlates the observed residual sequence 
with a specified function of time and then squares the resulting quantity. 

Here, what one, in essence, assumes is that one knows the sign history of each 
residual component and any important phase relationship among residual com- 
ponents under any particular failure mode. The prototypical example of this 
is when a failure manifests itself as a slowly-varying bias or drift in a 
particular residual component. Coherent processing in this case amounts to 
weighted summing of measurements followed by squaring. 

In the incoherent processing, one first computes particular weighted 
squared sums of the residuals and then sums the quantities over time. In this 
case, one is essentially assuming that a particular linear combination of the 
residual components is large at every point in time but that the temporal 
variation of this linear combination is unknown. Such a structure is of use, 
for example, in detecting noise variance changes. 

The tradeoff between coherent and incoherent processing is relatively 
clear. In the coherent case, we are matching much more closely a particular 
temporal variation and therefore will reject a larger class of other varia- 
tions. Thus, if we really know this variation (i.e., the sign history of resi- 
duals under a particular failure mode), one would expect a superior detection- 
false alarm performance using the coherent processor. If, on the other hand, 
the actual variation pattern under failed conditions is significantly different 
from the assumed one, the coherent processor will not perform nearly as well, 
and one may have to settle for the more conservative detection-false alarm 
capabilities of the incoherent processor. 
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The decision rule in an FDI algorithm normally involves comparing the de- 
cision statistics with one or more thresholds. The thresholds are determined 
to optimize a detection criterion such as the Bayes' risk, probability of 
error, etc. 

Since analytic model-based redundancy is the key to any FDI system, the 
robustness of any such system depends on the reliability of the redundancy 
(parity) relations that are used, given the inevitable presence of model 
uncertainties, nonlinearities, and noise. That is, the problem of robust 
failure detection and isolation is concerned with designing the parity gener- 
ation and information collection phases of the FDI system so that the result- 
ing decision statistics are "maximally sensitive" to the failure modes and 
"minimally sensitive" to model uncertainties and noise. This perspective was 
at the heart of the F-8 sensor FDI project [11], and the more recent theor- 
etical work reported in [12], [13]. 

A-1.2 FDI DESIGN PHILOSOPHY 

The design of a practical FDI system requires consideration of several 
issues : 

1. Performance of the detection system as measured by the 
detection time and detection accuracy (correct detections 
and false alarms). 

2. Robustness, i.e., relative insensitivity of the FDI system 
performance to parameter variations and modeling errors or 
uncertainties. 

3. Computational complexity as measured by storage requirements 
and computation time. 

One could imagine formulating an overall robust FDI system design problem 
which uses detailed dynamical models and which accounts for model uncertain- 
ties. This alternative, however, is of conceptual value only as such a problem 
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is far more complex than can be solved practically as well as being more com- 
plex than is necessary to obtain a practical design. A second approach - 
which, in some sense, is the default approach used in most top-down attempts 
at FDI design - is to synthesize an FDI system based on a nominal model, 
neglecting model uncertainties. One then evaluates FDI performance in the 
presence of model uncertainties and then modifies the design to improve per- 
formance e.g., one adds hypotheses corresponding to model errors in order to 
alert the FDI system to the fact that such uncertainties must be distinguished 
from failures. This more or less trial and error approach has obvious draw- 
backs in terms of providing very little in the way of a concrete methodology 
and, more importantly, the designs obtained in this manner are often unwieldy. 
This is because various "bells and whistles” are generally added to alert the 
FDI system to the fact that part of the system may be in error. Consequently, 
it is not a simple manner to check all possible "what-if" scenarios in order 
to troubleshoot possible robustness problems. 

The third approach to robust FDI system design is motivated by a desire 
to overcome the limitations just mentioned by taking uncertainty into account 
at the start in order to obtain as simple an FDI system as possible and that 
makes optimum use of the best information available (here simplicity is impor- 
tant not just for computational reasons but, more importantly, because simply 
structured systems are far easier to troubleshoot in order to pinpoint poten- 
tial weakpoints and performance-limiting issues). This approach, which can 
be thought of as the decentralization of the FDI process, is based on the con- 
cept of identifying only those parts of the system model that are known with 
greater certainty and then design the system based primarily on the certain 
parts. The key idea here is that in such a design one can focus information 
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for a particular failure mode using only those parts of the model which pro- 
vide the best "failure signature-to-model uncertainty ratio" for this failure. 
This decentralized approach is in marked constrast to a centralized FDI design 
based on one complete model, say using a Kalman filter, in which failure in- 
formation and all sources of model uncertainty are diffused through the entire 
set of residuals. Consequently, when information collection (as in Fig. A- 
1-2) is performed for such an approach, one will, in general, be mixing to- 
gether high quality and low quality information and the result may be a loss 
in robustness. Another major advantage of the decentralized approach is that 
it results in an FDI system in which the computations required to detect and 
isolate individual failures are typically far simpler than in a centralized 
approach. Furthermore the decentralized structure allows greater flexibility 
in designing an architecture for its implementation than a centralized FDI 
design. Finally, because parity relations are explicitly chosen for partic- 
ular purposes (e.g., detecting a particular failure), detection logic is ex- 
tremely simple. 

For the reasons just outlined we have taken as the basis of our approach 
the design of robust residual generation and information collection systems. 
This approach was, in essence, used in the successful F-8 sensor FDI project 
[11] and in the theoretical work by Chow and Willsky [12] and Lou, Willsky, 
and Verghese [13] which formed the foundation upon which the work in the 
present project has been built. 

A-1.3 OVERVIEW OF THE DESIGN APPROACH 

In the next two sections, we describe the major analytical constructs 
that are needed in our approach. We preface that development here with a 
brief overview of the major concepts behind the approach. The first part of 
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our design methodology is to identify those portions of the system dynamics 
that are known with the most certainty, as the use of parity checks based on 
these portions of the dynamics will be of great value in minimizing false 
alarms. Thus the first problem is to obtain a rank-ordered list of parity 
relations, where the ranking is in terms of some “robustness metric" that 
quantifies how close to zero each parity check is under normal conditions 
given the presence of model errors and noise. 

The second problem is coverage - that is, assessing the ability of 
relations identified in the first step to detect a specified set of failure 
modes (each of which is also modeled with an allowance for errors in the model 
used) and the possible augmentation of this set with additional relations that 
are useful in detecting particular failure modes not well-covered by the 
initial group of relations. Let us note two aspects of this problem. The 
first is that it requires a second metric that measures the ability of a 
particular relation to distinguish between normal conditions and a particular 
failure mode (i.e., to give decidely larger values under the particular 
failure than under normal conditions) given the presence of model error and 
noise. Again this metric can be used to rank-order relations with respect to 
their usefulness for detecting particular failures. The second point is that 
the relations added at this stage are less reliable than the first ones ob- 
tained, in that they may have larger values when no failure has occurred. 

They may be needed, however, to achieve the desired coverage (i.e., to achieve 
a specified probability of detection for all failures). Note also that the 
metric used should provide a guide to the minimum magnitude of a particular 
failure that can be reliably detected (i.e., to achieve a specified probabil- 
ity of false alarm). 
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The final step in the design is concerned with the problem of distin- 
guishability : given that a failure has occurred, can we determine which 

failure mode has occurred. Here again we need a metric that measures the 
ability of a parity relation to distinguish a particular failure mode from an 
alternative set of possible events corresponding to one or more of the other 
failure modes. Note again that if additional relations ar needed, they will 
be inherently less reliable under normal conditions. However, at this stage 
we can, if necessary, avoid the impact such relations would have on false 
alarm rate by using a two-level structure. Specifically, the relations deter- 
mined in the first two stages are sufficient for detection of all failures, 
but may not be able to isolate all of them. Thus, one could use these rela- 
tions to detect and trigger the use of additional relations for isolation 
only. 

For the most part we have discussed the residual generation phase of the 
FDI system. However, the same ideas are also of value in designing the in- 
formation collection phase, i.e., in determining the data length required to 
achieve desired performance levels and in specifying the details of how suc- 
cessive residuals are accumulated. In the next two sections, we describe 
the metrics we have considered and indicate how they are used in the design 
of robust FDI systems. We also provide a brief discussion of the interpreta- 
tions of parity relations as reduced-order models and the use of our metrics 
to assess the robustness of other FDI systems. 
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A-2. ROBUST PARITY (RESIDUAL) GENERATION METHODOLOGY 


As we stated in Section 1, the key to robust FDI is the "pulling apart" 
of the most reliable of the available redundant relations among the measured 
variables. In this section, we provide a precise definition and a unified 
view of redundancy, and develop a methodology for generating robust redundancy 
relations in the presence of model uncertainties. We will begin our discus- 
sion of redundancy with static perfectly known systems with no noise. 

A-2. 1 REDUNDANCY AND PARITY RELATIONS: STATIC PERFECTLY KNOWN SYSTEMS WITH 

NO MEASUREMENT NOISE [14], [15] 

Suppose that the redundant measurements can be modeled by the measurement 
equation: 

y = Cx (A-2-1) 


where y is an m vector of measurements, C is the mxn measurement matrix of 
rank r and x is the n vector of state variables. In static systems, one can 
interpret redundancy relations in four equivalent ways as follows. 


A-2 .1.1 Projection Matrix Interpretation 

The unique minimum norm, least squares solution of Eq. A-2-1 is given by 


x = 



(A- 2-2) 


where C^ is the generalized (Moore-Penrose) inverse of C. We can view the 
estimate in Eq. A-2-2 as the best linear unbiased estimate with infinite prior 
covariance for x. The residual vector, p is given by 
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(A- 2-3) 


P = y - Cx = (I m ~CC*)y = P c y 


where P c is the mxm projection matrix, with the following properties: 

1, P c spans an (unobservable) subspace that is orthogonal to the 
(observable) subspace spanned by the rows of C. That is, 

P .C = fl -CC^ )c = C - CC^C = 0 (From Moore-Penrose inverse 
c ' m ; 

property) . (A-2-4) 


Also, the residuals satisfy the relation: 

p = P^ y = 0 +x matches the measurements y exactly . 

2. Rank of P c = m - rank of (C) = m - r. 

3. The rows of the projection matrix, P c provide m-r=t independent , 
algebraic relationships among the m measured variables. Since 
these relationships are algebraic, we call them parity checks of 
order 0 (or 0-th order parity system). The t independent rows of 
P c are the t parity vectors of the 0th order parity system. Note 
that the parameter synthesis approach of [16] is a 0th order 
parity system. 


4. The projection matrix P c is idempotent and symmetric, i.e.. 



Consider the singular value decomposition (SVD) of the measurement matrix C 

C - U Z V T (A-2-5) 

where 
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U = mxm matrix of left singular vectors 
E = mxn matrix of singular values 
V = nxn matrix of right singular vectors 
The matrices U and V are orthonormal with the following properties: 

T T 

UU = U U = I 

m 

T T 

VV = V V = I 

n 


(A-2-6) 


When rank [C] = r, the SVD (Eq. A-2-5) can be rewritten explicitly as: 


n-r r n 
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where 


U 1 U 1 = *t 


U 1 U 2 = °t ,r 
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(A-2-8) 


Using Eq. A-2-8, 


T 

U X C = 0 (A-2-9) 


Eq. A-2-9 says that the t left singular vectors corresponding to zero singular 
values of C are the t independent parity relations: 


T T 

y ■ Dj y ■ Uj Cx - 0 


(A-2-10) 


A-16 




where y = [yiy2* * *bt] T « The columns of Ui are the t parity vectors 
u l» u 2»**** u t; we call the parity check matrix . A remarkable property of 
the parity vectors u^ is that they form an orthonormal set of vectors in the 
measurement space, i.e., they are as separate as possible. 


u^^ Uj = 0 for all i * j 


(A-2-11) 


Remark 


The projection matrix, P c and the t left singular vectors, are related 


by 


P = U, U. . 
c 11 


This can be seen as follows : 


t T T T -1 T T 

P = I - CC = U, U, + II U„ - IL T V„ V„ U = U, U 
cm 11 22 222222 11 


since 


c< - V 2 J* 1 U 2 T and U 2 . I r . 


Example 2.1 ; Consider a dual redundant sensor system where signal of sensor 2 
is related to the signal of sensor 1 by a scale factor a. That is, 


y 



C x 


Rank [C] = r = 1 


a. Projection Matrix Approach 



1 + a 2 1+a 2 


t 

P = I - CC 
c 


1 + a 2 1+a 2 

-a 1 

1 + a 2 1+a 2 


Rank [P] = m-r =1 . 
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Thus, P c defines a single parity check equation 


Ui = a yi -yz = 0 . 


b. SVD Approach 

The SVD of C is 



_ — 


a 

* 1 



l 


/l+a 2 

. /l+a 2 


0 

c = 


-j 


• 



a 


-1 

* a 


/l+a* 


- - 


/l+a 2 

. /l+a 2 



T 

U Z V 1 . 

The left singular vector corresponding to the zero singular value is 


U 1 


a 

/l+a 2 

-1 

/TT^ 


The parity check equation u^ y = 0 becomes : 



T 

It is easy to see that P = u.u. . 

c 1 1 


A-18 



A-2.1.3 Eigenvector Interpretation 

Another interpretation of parity vectors, that will be useful in later 
sections, is that they are the eigenvectors of CC^. This can be seen by con- 
sidering CC T in terms of the SVD as: 

T T T 2 T 

CC = U E V VEU = U E U 

T 2 

+ CC 0 = U E 

T 2 

C C u^ — u^ I i— 1 ,2, . . . ,m • 

Thus, the left singular vectors u^ are the eigenvectors of CC^ and the parity 
vectors are the eigenvectors of CC^ corresponding to zero eigenvalues. Since 
tr(CCT) i s a measure of energy in the measured signals, parity vectors can be 
thought of as the directions of least energy . This observation leads us to a 
stochastic interpretation in terms of minimum entropy, whenever the state x is 
random. We explore this interpretation next. 

A-2.1.4 Minimum Entropy Interpretation 

Entropy is a statistical measure of uncertainty [17]. As we will now 
argue, the entropy concept can be used as a suitable criterion for parity 
selection. In particular, a parity check can be thought of intuitively as a 
variable which, under unfailed conditions, takes on as close to a specified 
deterministic value (typically zero) as possible. The less randomness there is 
in a parity check, the more useful it is for detecting failures. Thus, if we 
view entropy as a measure of uncertainty, a meaningful parity selection crite- 
rion is to choose parity relations which minimize the entropy (uncertainty) of 
y. Since this criterion is equivalent to minimizing variations of y in the 
transformed domain (i.e., parity space), it is reasonable to expect that the 
transformed measurements (i.e., parity relations) will have clustering 
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properties (i.e., in example 2.1, parity check y = y2 -ayi = 0, is clustered 
around the origin). 

To prove the minimum entropy interpretation of parity relations and to 
provide connections to the eigenvector interpretation, assume that x is 
Gaussian with zero mean and covariance, E , i.e., x ~ N (0,E). We assume that 
E > 0. The basic idea of minimum entropy approach is to select a parity 
transformation 

V = G y (A— 2-12) 

such that the entropy of parity relations, y is minimized. In Eq. A- 2-12, 
y is an m vector, y is a t vector of desired number of parity relations and 
G is an m by t parity check matrix. To avoid trivial solutions G=0, we impose 
the orthonormality constraint on G: 


T 

G G = I 


(A-2-13) 


Note that the covariance of y is CEC T , and this will be invertible if and 
only if C has full row-rank. In this case there are no perfect parity checks 
as discussed in the preceding sections (where y = 0 under unfailed condi- 
tions), but what we wish to choose are parity checks so that the uncertainty 
in y is as small as possible. If C has rows that are linearly dependent, then 
there are perfect parity checks, as discussed in the preceding subsections, 
corresponding to the eigenvectors associated with the zero eigenvalue of CEC T . 
If one wishes to construct additional parity checks, then one again must set- 
tle for some level of uncertainty. These can be constructed using the proce- 
dure we now describe but first removing the dependent rows of C, thereby 
producing a matrix C with the same rank as C. For these reasons, in the re- 
mainder of this section, we assume without loss of generality that the rows of 
C are linearly independent. 
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The entropy of y is given by 

H u (t) = _ / p( U) fci p( p) dp (A-2-14) 

y 

where we have indicated the explicit dependence of on the number of parity 

T T 

relations, t. Since y is Gaussian with zero mean and covariance matrix G CSC G, 
H^(t) = - in(2rre) + - in det (g T CZC T g) . (A-2-15) 

Thus, entropy Hy(t) is a function of the covariance function of y. The opti- 
mum transformation matrix G is given by the following theorem. 

Theorem 2.1 

The entropy function is minimized by taking the columns of G to be the 
t normalized eigenvectors associated with the smallest t eigenvalues of the 
covariance matrix of y, CIC T * 

Proof : 

Entropy Hy(t) in Eq. A-2-14 is minimized with respect to G by forming the 
augmented Lagrangian function: 

H p = - ( in2 ire ) + - in det (g T CEC T g) + - tr [ (G T G-I t ) *T ] (a-2-16) 

where T is the matrix of Lagrange multipliers. Near optimal G* we have, for 
small AG 

'1 *TT 1 , *t T * \ 

AH^ = - in det(G +AG) CEC (G*+AG) - - in det [G CEC G J 

+ " tr [((G*+AG) T (G*+ AG) - G* T g*) T] (A-2-17) 

2 
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Using (I + A) - * = I - A, and Jin det (I + A) = tr(A), we have the first order 
condition: 

t T T * *t X * —1 ★ 

AH^ = tr [AG {CEC G (G CEC G ) — G r}]=0 (A-2-18) 

(Here, we are using the fact that G*^CECTg* is invertible. This will be the 

case if C has full rank.) In order to satisfy Eq. A-2-18 regardless of AG, we 
need 

T * *T T * -1 * 

CEC G (G CEC G ) = G T . (A-2-19) 

Premultiplying Eq. A-2-19 by G* T and using G* T G* = I t , we have r = If Using 

T = I t , Eq. A-2-19 can be rewritten as 

T * * , * T T 

CEC G — G *(G CEC G J . (A— 2— 20) 

From Eq. A-2-20 it is clear that the columns of G* are the eigenvectors of 
CECT (in this case note that G*CEC T G* T is diagonal), and thus G* whose columns 
consist of t eigenvectors of CEC T is a stationary point of the entropy func- 
tion. It is immediate, then, from Eq. A-2-15 that the minimum is achieved by 
choosing the eigenvectors corresponding to the t smallest eignevalues of CEC^. 

Remarks 

1. The minimum entropy result we just obtained provides us with a con- 
venient interpretation in terms of eigenvector and eigenvalue analysis of the 
covariance function of y: the parity vectors are the eigenvectors of the 
covariance function of y corresponding to the smallest eigenvalues. This 
observation, and the minimum entropy interpretation enables us to extend the 
parity space concepts to dynamic and stochastic systems. More importantly, 
this result says that we can work directly with the experimentally obtained 
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covariance data to generate the parity relations , if a model of the system is 
not available . 

2. Since eigenvectors corresponding to the smallest eigenvalues of the 
covariance matrix are the directions of "near perfect" correlation, minimum 
entropy transformation provides the "most certain" or deterministic relation- 
ships among the measured variables, i.e., relations with least variability or 
relations having clustering properties. Thus, the eigenvalues associated with 
parity vectors are measures of their certainty (or "robustness"). 

3. The use of minimum entropy criterion to feature selection process 
in pattern recognition literature is well known [17]. The criterion is used 
because of the clustering effects it produces on the data, i.e., the parity 
check values are as close to a specified set of deterministic values as 
possible. 

4. Note that if x is equally likely to be in any direction, i.e., E«al, 
the parity vectors correspond to eigenvectors of CC T *»*, i.e., when we have no 
a priori information about the relative scaling of components of x, we recover 
the results of the preceding subsections. 

A-2.2 REDUNDANCY AND PARITY RELATIONS: STATIC PERFECTLY KNOWN SYSTEMS WITH 

MEASUREMENT NOISE. 

Consider a measurement system of the form 

y = Cx + v (A— 2-21 ) 

where x is a Gaussian state vector with zero mean and covariance, £; and v is 
a zero mean, Gaussian noise process with covariance R, which is uncorrelated 
with the original process, x. 

As before, we seek a parity transformation matrix G to minimize the 
entropy of parity relations, y = G T y given by 
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(A-2-22) 


^ £n (2 ire) + - in det [G T (CEC T + R) G ] . 

As before the entropy function is minimized by forming the transformation 
matrix G from the t normalized eigenvectors associated with the smallest 
eigenvalues of the covariance matrix, (CEC^+R). 

Remarks 

1. The minimum entropy orthonormal transformation G also minimizes the 
scatter (dispersion measure) given by 

d y 2 = tr ( E = tr ( gT < cz:cT + R > G ) . (A-2-23) 

2. In the statistical literature, the use of eigen analysis to charac- 
terize features is also referred to as factor analysis or principal component 
analysis (PCA) [18]. PCA has been used in model order reduction problems [19], 
system identification [20], and image processing [21]. 

3. A nice property of the parity relations u is that they are component 
wise uncorrelated, i.e., 

E = 0 for all i # j , i,j = l,2***t . (A-2-24) 

Under Gaussian assumption, the components are independent as well. 

4. When measurement noise is present, the minimum entropy transformation 
G provides parity relations that are as orthogonal as possible to both the sig- 
nal and the noise processes. 

5. In pattern recognition literature [17], [22], it is a common practice 
to use the autocorrelation function of the measurement process y, which is also 
the covariance matrix of y under the zero mean assumptions on x and v. The 
autocorrelation function approach has the important interpretation that the 
parity vectors are the directions of least energy. 
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A-2.3 REDUNDANCY AND PARITY RELATIONS: STATIC UNCERTAIN SYSTEMS WITH 

MEASUREMENT NOISE 

Consider a static, uncertain measurement system 

y = c £ x + v £ (A-2-27) 

where the measurement matrix C^ takes on values from a finite set of L models, 
£ = 1,2, is a zero mean, white-Gaussian noise process with covariance 
Rj£ under the model hypothesis £; x is an n-dimensional state vector with zero 
mean and covariance E^ under model hypothesis £. In addition, the a priori 
probability that the £-th model is correct is, P^. Note that the probability 
density of y is a sum of L Gaussian densities: 

L T 

P(y) - £ p * N(0,C Jl E Jl C i + R t ) . (A-2-28) 

£— 1 

Our objective is to find the minimum entropy parity transformation such that: 

T 

V m G y (A-2-29) 


subject to the normalization constraint: 

T 

G G - I t . 


(A— 2-30) 


The entropy of p for a given model hypothesis £ is 

H p/£ = 2 ( £n2lie ) + 2 d6t ( gT<< V* C £ T + R ^ )G ^ * (A-2-31) 

The average entropy of u over the range of model hypotheses is the weighted 
sum of conditional entropies given by Eq. A-2-31. That is, 

L 

H p (t) = P £ H y/£ (A-2-32) 

where we have shown the explicit dependence of H,, on the desired number of 
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parity relations t. Proceeding as in the proof of theorem 2-17, the minimum 
entropy parity transformation G* satisfies the nonlinear equation: 


* 

g r = 


L 

£ 

£= 1 



-1 


(A-2-33) 


where T is the matrix of Lagrange multipliers. Pre-multiplying Eq. (A-2-33) 

T T 

by G* , and noting that G* G* = I t , we have T = I t . Thus, the optimal parity 
check matrix G* satisfies the nonlinear matrix relationship: 


* 

G = 


L 

£ 

£=1 


(c . £ .C „ T + R J G* 
v £ £ £ £' 




(A-2-34) 


A simple successive substitution scheme can be employed to solve for the 
optimal minimum entropy parity transformation matrix. At iteration i, the 
idea is to find g(*) via * 


,(i> „ 


L 

£ 

£=1 


Cc, 


£C f T +R( )G^ 
£ £ V 


[G (i ‘ 


1)T 


(c £ c T + r o )g (1 

^ £ £ £ V 


- 1 ) 


r 1 


(A-2-35) 


The algorithm starts with an initial g( 0) and iteratively updates till 
DG(i) - G( i_1 ) H is small with respect to some suitable norm. However, if we 
approximate the density function of y (or p) , which is a sum of L Gaussian 
densities with a single Gaussian density having the same mean and covariance 
an important simplification arises. That is, we approximate 


L 

£ 

£=1 


p £ N(0, + R £ ) ~n(0,c) 


(A-2-36a) 


♦Alternatively one can use conjugate gradient method to solve for G using 
a modified form of Eq. 2-34 as the descent direction. 
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where 


C = 


L 

Z 

1 = 1 


(c 


T 

n Z„C„ + 

i i a 


R J 


(A-2-36b) 


With this assumption, we can state Theorem 2.2. 

Theorem 2.2 Assumption 2-36 implies that an optimal choice of G is the t 

sv /s, 

eigenvectors of C corresponding to the smallest t eigenvalues of C. 

Proof: Same as Theorem 2.1. 


Example 2.2 

Consider the dual redundant measurement system with three uncertain 
measurement matrices 


1 


1 


1 


n 

IVJ 

11 


0 

U> 

II 


1 .2 


1 


.8 


P 


1 




R 1 - R 2 - R3 


.1 0 
0 .1 


E 4 -l. *=1,2,3 


Using the Gaussian sum approximation, 


C 


1.1 l 

1 1.13 


Minimum eigenvalue » .12 

T 

Eigenvector = [.714 - .7] 


The robust parity vector is illustrated in Fig. A-2-1. 
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Figure A-2-1. Illustration of Robust Parity Vector in Static 
Uncertain Systems 
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A-2.4 REDUNDANCY AND PARITY RELATIONS: LINEAR DYNAMIC AND STOCHASTIC SYSTEMS 

WITH MODEL UNCERTAINTY 

A-2 .4.1 Problem Formulation 

Consider a linear time-invariant, discrete-time stochastic system model 


x(k+l) = A^ x(k) + D£ w(k) (A-2-37) 

y(k) = C* x(k) + v £ (k) (A-2-38) 


where the system parameters {A£, D£, C£} take on values from a finite set of L 
models, 1=1,2, •••,L ; x is an n-dimensional state vector with zero mean and 
steady-state covariance matrix, l£ under the model hypothesis Z ; y is the 
m-dimensional output vector; w is an dimensional, zero-mean, white-Gaussian 
noise process with covariance matrix Q ; and V£ is an m-dimensional, zero- 
mean, white-Gaussian noise process with covariance matrix R£. In addition, 
the a priori probability that the 1-th model is correct is P£ . 

Following [12], [13], we define a parity check of order p as a linear com- 
bination of the lagged and present values of the output over a window of size 

p such that the parity checks have small values if no failure occurs. Let 
T 

P (k) = [vij(k), , •••» P^(k) , •••, Pj.(k) ] denote a t-vector of such 

parity relations. Then, the parity relation p^(k) is related to the lagged 
and present values of the outputs by: 

Pf(k) - g i T Y p (k) ; i=l ,2 , • • • ,t (A-2-39) 


where Y p (k) and g.^ , the i-th parity vector, are 


Yp(k) = (yCk-p), y(k-p+l) > •••, y(k) ) 


X A 

®i = UiO* Sil’ Sip) • 


(A-2-40a) 

(A-2-40b) 
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We also write 


U(k) = G Y (k) 


(A-2-41a) 


where 


G = [g lt g 2 , g t ] 


(A~2~41b) 


The window of outputs Yp(k) satisfies a linear relationship of the form: 


Y (k) = M x(k-p) + II . w (k) + v (k) 
p p£ v pZp pi 


(A-2-42) 


where w p (k) and v p £(k) are lagged and present values of the process noise and 
measurement noise processes over a window of size p: 


w (k) = [w(k-p) , w(k-p+l), w(k) ] 


(A-2-43a) 


v pJt (k) = [v A (k-p), v^Ck-p+1), •••, v £ (k) ] 


(A-2-43b) 


M p £ is the p-th order partial observability matrix given by 


A 

Mp l “ 




L C * A i J 


a (p+l)m by n matrix 


(A-2-43c) 


H p £ is the extended process noise matrix of the form 
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p* 


C £ »£ 
C £ A £ D £ 


0 0 


0 0 


C £°£ 0 


0 0 

0 0 
0 0 


L C £ A £ D £ 


• • • i 


c i°i °j 


a (p+l)m by (p+lji^ matrix 


(A-2-43d) 


Note that the probability density of Yp(k) is a sum of L Gaussian densities: 


p(Y (k))= £ P N(0, C ) 

H £=1 


where C is the autocovariance function (ACF) of Y conditioned on the fact 
£ p 

that the model hypothesis is £. In the steady state, is given by 


^£ = M p£ E £ M £ + n p£ Q p E p£ + R p£ 


(A-2-45) 


where 


■= diag (R^) a (p+l)m by (p+l)m matrix 


Qp = diag (Q) a (p+l)n w by (p+l)n w matrix 


E. = A E A T + D QD^ 
£ £ £ £ £ £ 


(A-2-46a) 

(A-2-46b) 

(A-2-46c) 


Using Eq. A-2-46c, repeatedly, it is easy to see that is a (p+1) by 


(p+1) block Toeplitz matrix of the form: 
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c o* 

c u • - 

V 

c u 

• 

• 

£ 0l 

• 

• 

c p-l,* 

• 

• 

• 

- 

• 

A. 

c , • • • 

p-1, i 

• 

A. 

C 0£ _ 



c * £ i c t * R r 1 -° 

' 

c £ A 1 E £ C £ T ; i=l ,2, ,#, ,P 


(A-2-47a) 


(A-2-47b) 


where E^ is the steady-state covariance matrix of the state x(k). 

Our objective is to find the t best parity relations that are maximally 
insensitive to model uncertainties and the process and measurement noises. We 
discuss two robust redundancy metrics that accomplish this objective. 

A-2.4.2 Covariance Based Robust Redundancy Metric 

In [13] a "robust redundancy metric" is proposed that is a weighted 
average of the trace of the covariance of p(k) under various model hypotheses, 
£=1,2, : 

L T 

J(p,t) = E P tr [E |p(k) p (k) } ] (A-2-48) 

*=1 

where denotes expectation conditioned on the £-th model hypothesis (note 
also that x(k) , w(k), v(k) , and hence p(k) are zero mean). In Eq. A-2-48, we 
have indicated the explicit dependence of J on the order of the parity 
relations, p, as well as the desired number of parity relations, t. Using 
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Eq. A-2-41a and performing some algebraic manipulations, we have the following 
minimization problem for the optimal choice of the (p+l)m by t parity 
transformation matrix G : 


min J(p,t) = min tr (G^CG) (A-2-49a) 

G G 


subject to 


T 

G G 


(A-2-49b) 


The constraint in Eq. A-2-49b is 
In Eq. A-2-49a, C is the average 
£=1 ,2 , • • • ,L and is given by 


included to avoid the trival solution of G=0. 

ACF of Y over the set of uncertain models, 

P 


L 

C = I P £ C £ (A-2-50) 

1=1 

where C is the ACF of Y conditioned on model £, defined in Eqs. a-2-45 - 
x, p 

A-2-47. Th e optimum transformation matrix G is given by the following 
theorem. 


Theorem 2.3 : The robust redundancy metric J(p,t) is minimized by taking the 

columns of G to be the t normalized eigenvectors associated with the smallest 
t eigenvalues of the average ACF C of Eq. A-2-50. 

Proof ; Form the augmented Lagrangian function: 

j'(p,t) = tr [G T CG + (G T G - I t ) r] (A-2-51) 

where T is the t by t matrix of Langrange multipliers. The graident of j' 
with respect to G is given by 

V' = 2 (CG + GT) (A-2-52) 
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At the optimum G*, the gradient is zero. Using this condition and the 
constraint Eq. A-2-49b, we have 


r = 



Using Eq. A-2-53 in Eq. 


A- 2-5 2, we have 


^ * * . *T«. * 

CG = G (G CG ) 


(A-2-53) 


(A-2-54) 


Equation A-2-54 implies that the optimum parity transformation matrix G* is 
formed by taking the columns of G* to be the t normalized eigenvectors 
associated with the smallest eigenvalues of the average ACF C. 

The results of Theorem 2.3 provide us with an important interpretation of 
parity relations. Since the parity vectors g£ are the eigenvectors of C asso- 
ciated with the minimum eigenvalues, the parity checks wi(k) of order p cor- 
respond to "near perfect" correlation among the output variables over a window 
of size p . In addition, since the eigenvalues are measures of signal energy 
along the directions represented by the eigenvectors , the parity vectors gj 

can be thought of as the "directions of least output energy." Moreover, if 
the eigenvalues X^ of C are ordered according to size 


X 1 < X 2 < < X (p+1)m (A-2-55) 

then with the columns of G* chosen as the eigenvectors corresponding to the 
smallest t eigenvalues, the optimum values of the robust redundancy metric is 




t 

I 

i= 1 


(A-2-56) 


That is, the first column of G* , g^, is the most robust parity check and Xj is 
its measure of robustness, g 2 is the next best parity check with X 2 as its 
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measure of robustness, and so on. Thus J*(p,t) has the interpretation as the 
overall robustness measure, if one were to use the t best parity relations of 
order p. Since the are increasing, the general shape of J*(p,t) as a 
function of p and t is as shown in Fig. A-2-2. What this curve provides us is 
a summary picture of the level of robustness in a particular set of parity 
relations. Intuitively, for a given p, a good choice of t would be near the 
"knee” of the curve (i.e,, at values of t at which begins to increase 

more dramatically), although the value of t also depends on the number of 
failure modes that should be detected and isolated (see Section A-3). 



Figure A-2-2. Illustration of the Robust Redundancy Function 
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Another potential use of the robust redundancy curve is in the design 
process. Specifically, different sensor complements (i.e., different choices 
of the measurement matrix in Eq. A-2-38) will in general have different re- 
dundancy characteristics. By comparing their robust redundancy curves one can 
determine a useful measure for the relative merits of alternative sensor sets 
in terms of the likely FDI algorithm performance that would result in each 
case. Also it is worth noting that in the development in [13], the parity 
vectors are interpreted as the left singular vectors of a scaled and extended 
observability matrix rather than the eigenvector interpretation presented here. 
Our interpretation of parity check generation in terms of eigen-analysis of 
the average ACF function allows us to work with experimentally-obtained 
covariance data , if a model of the system is not available. In addition, as 
will be discussed in the following, our results provide an important link 
between parity check generation problem and the results in autoregressive (AR) 
spectral estimation [ 23 ] — [ 25 ] , system identification [20], model order reduc- 
tion [19] and the feature extraction problems in pattern recognition [22]. 

A-2.4.3 Entropy Based Redundancy Metric 

An alternative metric for robust redundancy can be derived based on the 
concept of entropy. As discussed in Section 1, a parity check can be thought 
of as a signal that takes on values as close to a specified deterministic 
value (typically zero) as possible under normal system operation. The less 
randomness there is in the parity check, the more useful it is for detecting 
failures. Thus, a meaningful parity selection criterion is to choose parity 
relations ui(k) such that they have minimum entropy. We use a weighted 
average of the entropy under various model hypotheses £=1,2,*»»,L as our 
robust redundancy metric: 
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(A-2-57) 


V»-'> - P * \|t 

where | ^ is the entropy of y(k) conditioned on model £. Since p(k) is 
Gaussian with zero mean and covariance G C^G under the assumption of model £, 

H u|£ 18 glven by 

t i 

H u|£ = 2 £n (2ire) + j d6t ^ (A-2-58) 

Thus, the optimization problem is to minimize H y (p,t) in Eq. A-2-57, with re- 
spect to the parity transformation G , subject to the normalization constraint 
G^G^It* The result is given by Theorem 2.4. 


Theorem 2.4 : The entropy function H^(p,t) is minimized by a parity transfor- 

mation matrix G* that satisfies the nonlinear equation: 


* 

G 


L 

E 

£=1 


P„ C n 
£ £ 


* , *T<v * s-1 

G (G X C C ) 


(A-2-59) 


Proof : Same as Theorem 2.1. 

A successive substitution scheme can be used to solve for the optimal 
minimum entropy parity transformation. However, if we approximate the density 
function of Yp(k) (or p(k)), which is a sum of L Gaussian densities, by an 
equivalent single Gaussian density with the same mean and covariance, as in 
Eq. A-2-36 , an important simplification arises. That is, 

E (y(K) } = 0 (A-2-60a) 

cov {y(k) } = G T ( E C £ ) G = G T CG (A-2-60b) 
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The optimum parity transformation is given by the following theorem. 

Theorem 2.5: The Gaussian sum approximation implies that an optimal choice of 
G is the set of t eigenvectors of the average ACF C corresponding to the t 
smallest eigenvalues. 

Proof : Same as Theorem 2.1. 

The result of Theorem 2.5 says that the implicit probabilistic assumption 
made in the work of Lou, Willsky and Verghese [13J is precisely the Gaussian 
sum approximation (Eq. A-2-60) and that their metric provides robust parity 
relations with respect to the "average" model defined by the approximation in 
Eq. A-2-60. 

Example 2.3: The F-100 Engine Example 

In order to determine the optimally redundant parity vectors under normal 
(unfailed) conditions, we used the linearized system matrices A and C from the 
simplified nonlinear simulation of Ref. (26] at power lever angles (PLA) of 
60°, 65° and 70°. The control is assumed to be zero for simplicity. As will 
be illustrated subsequently, the effect of control on parity generation can be 
included into the parity generation technique in a straightforward manner. 

The three sets of system matrices A and C, shown in Table A-2-1, exhibit a 
significant change in parameters. In this example, state dimension, n=4 and 
output dimension m=5. The four state variables are: 

X 1 = Nj Fan speed (RPM) 

X 2 = N 2 Compressor Speed (RPM) 

X 3 = TT410 Burner exit slow response temperature (°K) 

X 4 = TT4.510 Fan turbine inlet slow response temperature (°K) 
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TABLE A- 2-1. SYSTEM MATRICES A AND C AS A FUNCTION OF POWER LEVEL ANGLES 
(PLA) 


PLA-degrees 

Matrix 

Coefficient Values 



-5.133 

4.750 

WBM 



A 

-0.298 

-2.982 


: : '■ ; : 



-0.0034 

-0.00013 

-0.639 




.0284 

-0.0290 

-0.0413 

-1.918 

60° 








mm 


0.000 

0.000 





0.000 

0.000 


C 


1' 1 

-0.013 

-0.0124 





-0.023 

-0.0195 



■HI 

-0.447 

-0.226 

.0148 



-4,827 

4.749 

-0.310 

1.353 


A 

-0.0623 

-2.887 

1.195 

-3.131 



-0.0362 

-0.0013 

-0.639 

.0026 



.0007 

-0.029 
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The five sensor outputs are: 

yi = Ni Fan speed (RPM) 

Y2 = N2 (Compressor speed (RPM) 

Y3 = PT4 = Burner pressure (N/m^) 

Y 4 = PT6 = Augmentor pressure (N/m2) 

Y5 = FTIT = Fan turbine inlet temperature (°K) 

Our objective is to find a single set of 5 parity relations that are 
linear combinations of present and past sensor outputs, for the three dif- 
ferent operating conditions, PLA = 60°, 65° and 75°. If we can successfully 
design such a set for these three widely differing operating conditions, it is 
anticipated that the parity vectors for smaller uncertainties (approximately 
±3°PLA) around a single operating point will be substantially robust. As- 
suming = I and R^ = 0 for £ = 1,2,3 operating conditions, the optimal ro- 
bust redundancy function, J*(p,t) is computed for various values of t and 
p - 0,1, 2, 3 and plotted in Fig. A-2-3. It is clear that a complete set of 
five robust parity relations can be obtained by using parity orders of at most 
2. The parity transformation matrix, G associated with the 5 parity relations 
is shown in Table A-2-2. 
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Figure A-2-3. Robust Redundancy Function vs. the Number of Parity 
Relations for the F-100 Example. 


TABLE A-2-2 . THE PARITY TRANSFORMATION MATRIX G FOR THE F-100 EXAMPLE 
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2.4.4 Complements 


The parity check generation methodology discussed in subsections A-2.4.2 
and A-2.4.3 has important implications in autoregressive spectral estimation 
[23-25], system identification and model order reduction [19,20]. Some of 
these issues, as well as frequency domain interpretations of parity relations, 
are discussed below. 


Parity Relations as Reduced Order ARMA Models 

Recall from Eq. A- 2-39 that the component i of y(k) can be written as: 


^(k) 


i g^ T y(k-p+j) 

j=0 1J 



Y p (k) 


i=l,2, 


where g^ is column i of G . We say that the parity relation is strictly of 
T 

order p if g± p * 0 . Consider such a relation and denote the components of 
gip by 

g ip = feipl’ g i P 2’ g ipq * g ipm ^ • (A-2-61) 


Assuming, without loss of generality, that gipq * 0, then component q of out- 
put vector, y q (k), can be predicted from yq(k-l), yq(k-2), •••, yq(k-p) and 
other output components y r (k), y r (k-l), •••, yr(k-p), r*q using the i-th 
parity relation as: 


y (i) (k) 

q 



p-l p m 

1 g n a y p+J) + 1 1 « 11r y_( k ~p+j) 

j=0 1Jq q j=0 r=l iJr r 

r^q 


(A-2-62) 


where the superscript i denotes that yq is predicted using the i-th parity 
relation. There are two important interpretations of Eq. A-2-62. The most 
obvious is that the right-hand side of Eq. A-2-62 represents a synthetic 
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measurement of the q-th output that can be compared to yq(k) to generate (a, 
scaled version of) the parity relation u^(k) . Thus Eq. A-2-62 can be inter- 
preted as a prediction filter and leads to a generalization of the concept of 
parameter synthesis, discussed in [16] and successfully applied in [4], where 
the generalization stems from allowing the right-hand side of the prediction 
system in Eq. A-2-62 to have memory. The second interpretation of Eq. A-2-62 
is as a reduced order ARMA model for yq(k). That is, yq(k) is expressed in 
terms of its own lagged values and of the present and past values of the re- 
maining output components y r , r=l,2,*»*,m, r*q, which are treated as external 
inputs , and a noise process ”i(k), i.e.. 


y (k) = — 

gipq 


p-1 

E 

j=0 


g ijq 


y q (k-p+j) + 



m 

E 

r=l 

r*q 


g ijr 


(A-2-63) 


This method of obtaining reduced order ARMA models has similarities to the 
work in [ 19] , [20] . 


Parity Relations as Prediction Error Filters 

From Eqs . A-2-62 and A-2-63, it is clear that the parity relation Pi(k) 
can be regarded as a (weighted) one-step prediction error filter (PEF) of 
order p, since 


^(k) = g ipq [y q (k) -y^U)] . (A-2-64) 

Since the PEF has finite memory, ui(k) is also referred to as a finite memory 
(FM) parity check [12]. PEF interpretation of parity relations is shown in 
Fig. A-2-4. PEFs have been studied extensively in the AR spectral estimation 
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literature [ 23 ] — [ 25 ] . Our method of obtaining PEF coefficients is akin to 
Pisarenko's harmonic retrieval method discussed in [ 23 ] — [ 25 ] • However, unlike 
the approches of spectral estimation literature, the parity space approach 
does not fix the dimension of y(k) and, hence, the dimension of the PEFs a 
priori. The number of parity relations, t is chosen to minimize the cost 
function (Eq. A-2-48 or Eq. A- 2-57) while satisfying the FDI requirements (see 
Section 3). In addition, in computing the prediction error for yq(k), all 
other output components y r (k), r*q are treated as known inputs to the ARMA 
model. Finally, model uncertainty is taken into account explicitly in com- 
puting the average ACF that forms the basis for designing robust parity coef- 
ficients, whereas the covariance based PEFs treated in the spectral estimation 
literature assume a perfectly known model. 


Parity Relations as Moving Average (MA) Filters 

The window of observations Y p (k) can be written in the frequency domain 
as 

Y p (z) = AnCz) y(z) (A-2-65) 


where y(z) is the z transform of the output, y(k); and ^(z) is a (p+l)m by m 
matrix of delay elements given by 



(A-2-66) 


Then, the parity check vector y(k) can be written in the frequency domain as: 


T T 

y(z) = G Y p (z) = G (z) y(z) 

where 

G T (z) = G T A (z) 
m 


(A- 2-67) 


(A-2-68) 
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If we partition the (p+l)m by t parity transformation matrix G as 



where each Gj is an m by t matrix given by 


G j ~ [gjj g 2j 


»g t j ] ; o < j < p 


(A-2-69) 


(A-2-70) 


The m dimensional column vectors gjj , 1 < i < t are defined in Eq. A-2-61. 
Using the definition of ^(z) in Eq. A-2-66, we have 


p(z) 


I G T z' (p ' j) y(z) = G T (z) y(z) . 
j=0 2 


(A-2-71) 


Thus, the parity transformation is a set of t moving average (MA) filters of 
order p (in m dimensional space), as shown in Fig. A-2-5. 


m vector 


y(z) 



> y(z) 


t vector 


Figure A-2-5. Parity Transformation as a Moving Average (MA) Filter 
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A Frequency Domain Algorithm for Parity Transformation 


The MA filter interpretation provides us with a method for computing the 

parity transformation G in the frequency domain. In the steady state letting 
j (D , 

z = e , we have 


p(e JW ) 



(A-2-72) 


where GCe^ 111 ) is an m by t impulse response matrix. If we let m) be the 

average power spectral density (PSD) matrix (which is a real function of m) 
over all possible model hypotheses, then the autocovariance related criterion 
in Eq. A-2-48 becomes: 


J(p,t) = tr 


-1 2 IT 

[(2ir) / 4> u (oj) da>] 


(A-2-73) 


Equation A-2-73 is derived by using the property that the average auto- 
covariance function (ACF) of jj(k) , G CG and the PSD are Fourier trans- 

form pairs. The average PSD matrix 4>^(io) is given by 

L 

%(“> = Z P z (A-2-74) 

£— 1 

where $y£(ui) is the PSD of p(k) under model l and is related to the PSD of the 
output process 4>y£(u>) via 

*y £ <“> = GT ( eja> ) * y£ < w > (A- 2-75) 

The PSD of the output process is related to the system parameters under model 
1 and is given by 


4> .(“) ■ C (e^ uI - A ) _1 D Q D^ (e -ja)I - A ) 1 + R 
yl JT V Z Z i 


(A-2-76) 


Equation A-2-73 must be minimized subject to the constraint: 
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It is easy to see that the best choice for GCeJ^), (although not necessarily 
causal), at frequency <d is the set of t eigenvectors corresponding to the 
smallest t eigenvalues of the average PSD $y( to) given by 

L 

4> y (o)) = p z * y4 (ft>) (A-2-78) 


An approximate algorithm for obtaining a pth order MA filter is as follows: 

a. Divide the interval [0,2 it] into p equal segments, and let 


2ir 

oij = j • — ; j = 0,1 ,2 , • • *,p 

P 


b. Evaluate $ (to) for each i = 1,2,...,L and the t orthogonal eigenvec 

y * 

tors, G corresponding to the smallest t eigenvalues of the average PSD $ (to.). 

j y j 

Denote them by A^to^), i=l,2,*»»,t. Scale each eigenvector by A^ * (lo^). 

c. Normalize each column of G so that 


£ g.,.j T g,, = 1 5 i=l*2, 

J-0 J 


where gjj is the scaled version of the i-th column of Gj (an m vector). The 
resulting normalized Gj provide the desired MA filter coefficients. The pro- 
cedure is only approximate since Gjs are computed from only a finite set of 
to's. However, it has the advantage that the eigenvalue - eigenvector decom- 
position of an m by m matrix, G j , need be performed (p+1) times rather than 
the eigenvalue - eigenvector decomposition of a (p+l)m by (p+l)m matrix as in 
Eq. A-2-49. Since the approximation method is computationally efficient, it 



is not necessary to restrict the value of p to small numbers. An alternate 
method is to find Gj , j = 0,1,2, (N » 1) using the eigenanalysis and 
then perform inverse FFT to find the impulse response matrix G. One drawback 
of this approach is that the resulting G need not be nonant icipatory [20]. 

Another major advantage of the MA filter interpretation of parity rela- 
tions is that it provides us with a general method for generating parity vec- 
tors by weighting frequency ranges of interest. Suppose, W(z) represents such 
an m by m weighting filter so that the transformed output y(z) is given by 

y(z) = W(z) y(z) (A-2-79) 


For a given W(z), we want to find a pth order parity transformation (i.e., an 

T 

MA filter) so that y(z) = G (z) y(z) has minimum autocovariance or minimum 

entropy. It is easy to see that the eigen analysis of the average PSD of y(k) , 

$ (oj) at a). = j • for j = 0,1,2, "sp provides us with (approximate) p-th 

y j p 

order MA coefficients, G ^ , where $ y (u>) is given by 

3 y (ai) = W(e^ a) ) $ y (u)) W(e ju) ) (A-2-80) 


Generalization of Levinson Filter 

If, instead of G T G = It , we impose a constraint on the coefficients 
multiplying y(k) as 



(A-2-81) 


T 

G p = [g lp’ g 2p' * * *’ g tp^ (A-2-82) 

is an mxt matrix, then we obtain a generalization of Levinson filter. Note 
that the Levinson filter imposes the constraint Gp=I m . The constraint in 
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Eq. A- 2-81 says that we require the best t relations of the m vector output 
that minimizes the robustness metric in Eq. A-2-49. The optimal G is given by 
the following theorem. 


Theorem 2.6 

The robustness metric (Eq. A-2-49) is minimized subject to the constraint 

* 

in Eq. A-2-81 by choosing the columns of G p as the set of t eigenvectors of 
the matrix given by 


where 


T -1 

C A = C A - N 0 , N 

0 0 p p-1 p 


(A-2-83a) 


= [ Cp-!, C p _ 2 , Cj ] an m by pm 


matrix 


(A-2-83b) 


p-1 


/,k 5 T ^ T 

c o C 1 - Vi 


c c 

1 0 


- C p-1 C p-2 


• C 


P-2 


^0 J 


an mp by mp matrix (A-2-83c) 


/N. /V 


where Cq, , •••, C p are the average ACFs defined in Eqs. A-2-47 and A-2-50. 
That is , 


c i ■ , 1 , p « c it • 0 ‘ 1 < r 

X” x 


(A-2-83d) 


The remaining G^ are given by 




*T 


til- 


*T T -1 
-G N 0 A , 
P P P-1 


( A-2-83e) 
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Proof: The augmented Lagrangian function is 


J'(p,t) = tr [G T CG + (G T EE T G - i t ) r] 


where r is a t by t matrix of Lagrange multipliers and 


E = 


a (p+l)m by (p+l)m matrix 


0 0 ••• 0 
0 0 ••• 0 

0 0 ••• xj 


The optimality condition VqJ 1 = 0 at the optimal G* gives 


* T * * 

CG + EE G r = 0 


or in expanded form 


0 , N 

p-1 p 


T 

N C. . 
- P OJ 


G , 

p-1 


* 

l g p 


0 

0 


* * 
- G p r 


where 


**T a r *T *T * T 

VI ‘ [G 0 G i ••• VI ] 


Equation A-2-85b implies that 


'•if * 

G = - 0 , N G 

p-1 p-1 p p 


and 


N 


T 

P 




* 

G 

P 


* 

G 

P 


* 

r 


(A-2-84) 


(A-2-85a) 


(A-2~85b) 


(A-2-86) 


(A-2-87) 


(A-2-88) 
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T * 

Using the constraint Gp = I t , r is evaluated as 


* *T ** * 

r = - G C- G 

p 0 p 


(A-2-89) 


where is defined in Eq. A-2-83a. Equations A- 2-88 and A-2-89 provide the 
desired result that the optimal G^ is formed from the set of t eigenvectors 
of Cq corresponding to its smallest t eigenvalues. Once G^ is computed, G^_j 
is evaluated from Eq. A-2-86. It is worth noting that the matrix arises 


prominently in the inversion of block Toeplitz matrices [27]. 


A Recursive Expression for the Time Window of Measurements 

We can derive a recursive expression for Yp(k) from the system equations 
(Eqs. A-2-37 and A- 2-42) whenever the system is observable. When Mp£ has full 
column rank, i.e., rank [M p = n under model hypothesis i, then Eq. A-2-42 
can be rewritten as: 


x(k) = T [Y (k) - II w (k) - v (k) ] 
pi L p pi p pi J 


(A- 2-90) 


where T is the generalized inverse of given by 

T = (M T M J ” 1 M T . 

pi ^ pi pi-* pi 


(A-2-91) 


Using Eq. A- 2-90 in Eq. A-2-37, we have 


[T „ Y (k+1) - A. T . Y (k) 1 = fT . II w (k+1) - A„ T n n , w (k) 1 
1 pi p i pi p v ' J l pt pt p i pi pi p J 


+ lV V p«! <k+1) - A * T p» V P t <k)1 + D «“ <k) ' 


(A-2-92) 


Since Eq. A-2-92 is in the form of a linear least-squares estimation problem 
for noise processes, Ref. [28] uses it as a basis to identify the noise 
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statistics (i.e., mean and variance of w and v). It may be noted that a 
direct recursive expression for Yp(k) can be obtained from Eqs. A-2-37 , A-2-42 
and A-2-90 as: 


Y (k+1) = M A T Y (k) + M D 0 (k) 
p pi £ pi p ' pi £w 


+ t n P *. V k+1) - V \ V ”pe "p <k> l 


+ [V k+ 1> -V'hTptVW] • 


(A-2-93) 


The advantage of Eq. A-2-92 over Eq. A-2-93 is that the estimation equation is 
of dimension n only (rather than (p+l)m of Eq. A-2-93). 


Extension to Include Control Signals 

We can easily extend the parity generation approach to systems involving 
control variables u(k). Formally, assume that the system dynamics are given 
by 

x(k+l) = A^ x(k) + u(k) + w(k) (A-2-94) 


y(k) - x(k) + u(k) + v £ (k) ( A _ 2 - 95 ) 

where x(k), W£(k) , V£(k) and y(k) are as defined in Eqs. A-2-37 and A-2-38. 
u(k) is an n u vector of control variables. As before, in order to generate a 
pth order parity relation, we form a window of observation Yp(k) , satisfying 

Yp ( k) - M pJi *(k> + r pt u p (k) + n pt w pl (k) + v t (k) (*-2-96) 

where Y^Ck), M^, IIp^, w^(k) and v^Ck) are as defined in Eqs. A-2-40 and A- 
2-42. The control related matrix, r . and extended control vector U (k) are 

r ~ r 

given by 
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p * 


C B, 
l l 


c *r\ - 


0 

0 


(A-2-97a) 


Up(k) = [u(k-p) u(k-p+l) ••• u(k) ] . 


(A-2-97b) 


Depending on whether the control signals are accessible for measurement or 
not, we can formulate two parity generation problems. 

Problem 1 ; (Control Signals are not Accessible for Measurement) 

Find a (p+l)m by t orthonormal parity transformation matrix G such that 
tr[G^CG] is minimized, where 


£ ^ E * iy k > v k) 1 • 


(A-2-98) 


The average ACF C can easily be evaluated, once the ACF of denoted by 
under each of the model hypotheses l, 1= 1,2,...,L is known. The solution of 
this problem involves finding the t eigenvectors of C corresponding to the t 
smallest eigenvalues. 

Problem 2 : (Control Signals are Accessible for Measurement) 

Find a (p+l)(m+n u ) by t orthonormal parity transformation matrix G such 
that tr[G C G] is minimized, where C is the average ACF of the augmented 

Si 3 

vector of outputs and control variables Y (k) given by 

pa 


Y T (k) £ [Y (k) U (k) ] 
pa p P 


(A-2-99a) 
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(A-2-99b) 


C (k) 
a 


A 


L 

Z 

£=1 


P„ E. 
£ £ 



< k > 1 • 


The average augmented ACF C is of the form: 

cl 


where 



C 

yu 


L 

Z 

£=1 


P £ r p£ C u£ 


(A-2-99c) 


(A-2-99d) 


C 

u 


L 

Z 

£=1 



(A-2-99e) 


The solution of problem 2 involves finding the t eigenvectors of C corres- 
ponding to the t smallest eigenvalues. 

It may be noted that the key to the solution of problems 1 and 2 lies in 
the computation of the ACF of U p (k) under each model hypothesis £, C^. The 
ACF of Up(k) can, for example, be obtained in one of the following three ways: 

a. u(k) is given by the feedback control law: 


u(k) = k p x(k) . (A-2-100) 

b. u(k) is given by the proportional-integral feedback control law: 

k-1 

u(k) = k y(k) + k„ Z y(i) . (A-2-101) 

1 * i=0 


c. u(k) is an unknown, but bounded function or equivalently, Up(k) has 
known autocovariance function, C^ under each model hypothesis £. 
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A-2.5 SUMMARY 


In this section, minimum entropy concept was used to provide a unified 
view of parity generation encompassing static and dynamic systems with and 
without noise and model uncertainty. Using this concept, an explicit non- 
linear equation (Eqs. A- 2-34 and A-2-59) is derived for the robust parity 
transformation matrix, G in the presence of model uncertainties. In order to 
simplify computations, we have approximated the sum of Gaussian densities with 
a single Gaussian density with the same mean and covariance. This assumption 
led to a simple interpretation of G in terms of the eigenvectors corresponding 
to the smallest t eigenvalues (t = required number of parity relations) of the 
weighted sum of covariance matrices under various model hypotheses. This 
result was contrasted with the previous results of Chow, Lou, Willsky and 
Verghese [12], [13]. In addition, we derived a frequency domain algorithm to 
compute the parity transformation and have provided numerous interpretations 
of parity relations. Some of these interpretations are as follows: 

1. For a deterministic system with no model uncertainty, the parity 
relations lie in the unobservable subspace under normal system 
conditions. That is, the range space of parity vectors is ortho- 
gonal to the range space of the observability matrix. 

2. The parity relations can be viewed as reduced order ARMA models 
of individual outputs. 

3. Parity relations can be interpreted as the prediction error 
filters (PEFs) that arise in the spectral estimation literature. 

4. Parity relations correspond to near perfect correlations among 
measured outputs. 

5. Parity relations are the directions of least energy. 

6. Parity relations have minimum entropy (or the greatest certainty 
or robustness). 
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A-3. REDUNDANCY RELATIONS FOR ROBUST FAILURE DETECTION AND 
ISOLATION 

A-3. 1 PROBLEM DESCRIPTION AND MOTIVATION 

The methods of Section A-2 do not take any specific failure mode into 
account but are aimed at selecting a set of robust parity relations that pro- 
vide the smallest values for certain metrics under normal operating condi- 
tions and model uncertainty. However, the presence of robust redundancy in 
the sense of the previous discussion does not guarantee that all the failure 
modes can be detected and distinguished. In addition, the information about 
the presence or absence of a particular failure mode generally accumulates 
over time. Failure decisions making use of a time sequence of parity rela- 
tions and/or measured system outputs will have lower error probabilities than 
"single-shot" decisions, if the proper use of failure information is made. 

In this section, we develop statistical distance-based criteria for optimally 
robust ,parity generation for the detection and isolation of specific failures 
and for the information collection phase of the FDI system. 

The statistical distance based criteria serve four major roles in our 
robust FDI formalism. First, if the failure modes are strongly observable (in 
the sense of achieving a prescribed probability of error for a specified 
detection delay) through the use of the robust parity relations generated in 
Section A-2, then the robust FDI criteria of this section specify the optimal 
method of combining the parity relations over the detection window. That is, 
information collection can be conceptualized as the generation of a single 
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detection parity check (or "detection decision statistic") from the robust 
parity relations of Section A-2. Second, if certain failure modes are not 
strongly observable through the robust parity relations, then the robust FDI 
criteria of this section can be used to generate detection parity checks 
("detection decision statistics") from the system output. The parity checks 
so generated are not optimally robust in the sense discussed in Section A-2, 
but are sensitive to a particular failure mode. Third, if one or more failure 
modes are not distinguishable through the robust parity relations and detec- 
tion parity checks, the robust FDI measures can be used to generate Isolation 
parity checks ("isolation decision statistics") that are most sensitive to 
distinguish the particular pair or pairs of failure modes. In this case, the 
operation of an isolation parity check is triggered by the corresponding 
detection parity checks. Finally, the statistical distance measure introduced 
in this section provide bounds on the probability of error, and, hence, can be 
used to evaluate the effectiveness of alternative FDI schemes in terms of 
their ability to detect and isolate various failure modes. Thus, the measures 
of this section provide a unified methodology to devise a "divide and conquer" 
approach to the robust FDI design problem. In this section, we restrict our 
attention to the second and fourth roles, viz., the generation of detection 
parity checks from the window of system measurements, Yp, and the evaluation 
of alternative FDI schemes. However, the theory is applicable for the other 
two applications mutatis mutandis. 

The specific problem we consider here is the choice of parity checks for 
the robust detection of a particular failure mode. We assume that the system 
model under normal operation is 
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(A— 3—1 ) 


x(k+l) = A N£ x(k) + D Nj1 w(k) 

y(k) = C N£ x(k) + v(k) (A-3-2) 

while the system model under failure is 

x(k+l) = A^ x(k) + D Fi w(k) + d p (k) (A-3-3) 

y(k) = C p£ x(k) + v p (k) + b p (k) (A-3-4) 

where dp(k) and bj?(k) account for additive failure effects (e.g., biases, 
drifts). In this case, we would like to find parity checks p(k) that result 
in maximally dispersed statistics under failed and normal modes of system 
operation. 

Ideally, the optimum parity relations are those that minimize the proba- 
bility of making erroneous decisions. However, in most situations an analytic 
expression for the probability of error is difficult to derive, and, even if 
it can be found, the expression is too complicated for optimization. There- 
fore, it is useful to search for criteria that are easier to evaluate and 
optimize and that are, in some sense, related to the probability of error. 
Statistical distance measures such as the I-divergence , the J-divergence , and 
the Bhattacharyya distance between two probability distributions under two 
hypotheses (normal and a failure mode or two different failure modes) provide 
such easily computable criteria. These distance measures have found wide 
applicability in several areas of systems engineering, and most notably in 
pattern recognition [17], [22], control systems [ 29 ] — [ 3 1 ] , communications [32] 
and information theory [33], [34]. They also have a long history and utility 
in statistics [ 35 ] — [ 39 ] • In this section we propose criteria based on 
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J-divergence and the Bhattacharyya distance to generate parity relations to 
achieve maximum discrimination among failure modes. Before, we solve the 
parity generation problem, we provide a brief background on the distance 
measures and some of their useful properties. 

A-3.2 THE DIVERGENCE AND THE BHATTACHARYYA DISTANCE MEASURES 

J-divergence is a measure of dissimilarity between two hypotheses or 
probability distributions. To introduce the concept of J-divergence, consider 
two hypotheses N (normal) and F (failed), and let y be a t vector of parity 
relations. We denote by p( y/N) and p( y/F) be the two conditional probability 
density functions of y under normal and failed hypotheses respectively. Then, 
the discriminating information for hypothesis F over hypothesis N is given by 
the log likelihood ratio, Lp^: 

p( w/F) 

l FN = to • (A-3-5) 

p( y/N) 

The average discriminating information for hypothesis F is defined by 

p(y|F) 

I FN(P» t ) = / P(y| F ) to ; — dy (A-3-6) 

y p(y|N) 

where we have shown the explicit dependence of the average discriminating 
information on the order and the number of parity relations. The distance 
measure Ip^ is often called the Kullback-Leibler (K-L) information measure or 
the I-divergence. The discriminating information for hypothesis N versus 
hypothesis F is given by the log likelihood ratio: 

p( y/N) 

Lnf = to = - Lfn (A- 3-7) 

p( y/F) 
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The average discriminating information for hypothesis N is given by the K-L 


measure: 

P( u/N) 

^(Ptt) = / p( p/N) Jin dp . (A-3-8) 

V p( p/F) 


Note that Ipp = Ijjn = 0. However, in general, the I-divergence is not 
symmetric, i.e., 


FN 


* I 


NF 


(A- 3-9) 


The total average information for distinguishing hypothesis F from hypothesis 
N is the J-divergence, first introduced into the statistical literature by 
Jeffreys [40,41]. The J-divergence is defined as the difference in the 
average values of the log likelihood ratio: 

p( p/F) 

JfnCp.O “ / [p(y/F) - p( p/N) ] in dp . (A-3-10) 

V p(p/N) 


It is easy to see from Eqs. A-3-6 and A-3-10 that 

J FN(PiO = lFN(P»t) + iNFCP.t) • (A— 3—1 1 ) 

Thus, the J-divergence is a symmetrized form of I-divergence. The J-divergence 
satisfies several useful properties. 

1. JpN > 0 for F*N 

2. J FF - Jnn = 0 

3. J FN - Jn F 

4. JpN is additive for independent measurements, i.e., divergence based 

on t independent measurements (e.g., components of p) is equal to the sum of 
the t divergences based on each measurement separately. 


A-62 



( A— 3—1 2 ) 


m 

J FN (lJ l U 2’***’ lJ t ) = k f 1 J FN ( \ ) * 

However, J-divergence does not satisfy the triangular inequality required of 
a metric: 

JpN + Jnf' > ^FF 1 in general . (A— 3~13) 

Finally, the Bhattacharyya distance, Bp^* is defined as the negative log 
of the Bhattacharyya coefficient, ppN# 

0.5 

PFN “ / [pC y ! F) p( y | N ] dp (A-3-14) 

V 

Bfn ** -in PFN • (A-3-15) 

Since / p(p/N)dp =* / p(p/F)dp = 1, Bhattacharyya coefficient may be regarded 
U P 

as the cosine of the angle between the two vectors p( p/F) and p( p/N) in the 
space spanned by the parity vector p. Thus, pfn is a measure of correlation 
between the distributions of p under hypotheses F and N. Therefore, the 
Bhattacharyya coefficient, pfn lies between 0 and 1, and the Bhattacharyya 
distance, Bfn is bounded by 0 < Bfn < •• Note, in particular, that 
PFF = PNN = i an <l Bpp = Bun =0. As with the J-divergence, Bhattacharyya 
distance need not satisfy the triangular inequality required of a metric. 

Example 3.1 

Evaluate 1-divergence , J-divergence and the Bhattacharyya distance for a 
Gaussian random vector p of dimension t characterized by the following two 
distributional parameters under two hypotheses F and N. 

p( p/F) = N(p F ,E F ), p( p/N) = N ( %) 
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where we have assumed nonzero mean, for u under hypothesis N, for the sake 
of generality. The logarithm of the likelihood ratio, denoted by Lpjj, is 

l f» ' - ln det ‘Vf* 1 + ^ - V* % 1<n - V 

- - (u - Up) 1 J^ 1 ( U - P p ) . (A-3-16) 

The I-divergence is 1^^ = / p(p/F)L FIiJ dp and has the form: 

I FN = “ 2 £n det ( Z F Z N ^ + 2 tr ^ E N I t 1 

+ 2 (U F " W N ) E N K ~ » j n) (A-3-17b) 

- - j tn d.t (Ip Ip 1 ) + - 2 tr [Ip (iQ 1 - e; 1 )] 

I XI 

+ - - ^n) (W F - V • (A-3-17b) 


The J-divergence is given by Jppj = Ip^ + INF and has the explicit form: 

1 1 'i , /■ _ — 1 . ~1 1 r~ - -i r* ~ ^T 


J FN o tr ^ E F "* Z N^ ^ Z N E F ^ + ( Z N + Z F ^ ^ ^F ^ (A-3-18a) 


“ 2 tr ih Z N 1+ Z N Z F 1+ ^F 1+ O CWp “ %) (Pp ~ * (A-3-18b) 


The Bhattacharyya distance, Bpjj can be determined, after some tedious 
computations, as 


b fn - ; «" (4 + g e ; 0 - 5 - %°- 5 i + g (v - H) [^r- N 


( A-3-19) 
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Two special cases are of particular interest 


Case 1 . Equal covariances under two hypotheses F and N: Ep = = E 

The I-divergence in this case is symmetric and is given by 


1 _ __ 'J' _ 

*FN = 2 ~ ^ ^ ^ * (A-3-20) 

The J-divergence is 

J FN ( y F ~ ^ Z ^ ^ = 2 I FN • (A-3-21) 

It may be noted that the term (up - pfj ) E (yp - mn) is the Mahalanobis 
generalized distance . The Bhattacharyya distance in this case is 

B FH ■ i ("f ■ * “J ■ -f • (A-3-22) 

Case 2 . Equal means under two hypotheses, lip=yN* The I-divergence, the J- 
divergence and the Bhattacharyya distance are given by 


I pN - - ; Hn det (l p tr (£ p l" 1 ) - * (A-3-23) 


J FN ’ i “ [ £ F £ »‘ + £ H £ f‘ I' 1 


-1 


(A-3-24) 


B 


FN 


1 

- In det 
2 





E 


-0.5 

F 


)/ 2 ] • 


(A-3-25) 


-1 

Note in particular that when UEp E^ -It U is small with respect to any suitable 
norm, the I-divergence has a simpler form. In this case, note that 
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FN 


- J *■> d “ ( s r e n‘J + i tr £ n‘) ' \ 

- *» det «£ f C - V + \) + i tr ([ £ F - r n) C) • 


Expanding the £n term up to second order, we have 


d FN - - ; » [ £ f £ n‘ - £ tl + ; tr lt £ F h 1 - ^l 2 ) + ; cr K £ f - 


RJ 


: tr K £ f £ n‘ 



(A-3-26) 


Thus, IpN > 0 for all Ep * Ejj* 

A-3.2.1 Relationship Between Distance Measures and Probability of Error 

The J-divergence and the Bhattacharyya distance measures are important in 
their own right, but their value is enhanced by the fact that they provide 
bounds on the probability error (misclassif ication) in statistical hypothesis 
testing problems. In the binary hypothesis testing problem, let p^ = (p(l) 
p(2) , . . . ,p(k) ) be the time-sequence of parity checks, each parity check p(i) 
being a t dimensional vector. Let p( ptyF) and p(p^/N) be the probability 
density functions of pk under hypotheses F and N, respectively; and 6p, 
6 n( = 1~$f) t * ie a P r * or *- probabilities (based on mean-time-between-failures 
of components) that the system is failed and normal, respectively. Then, the 
probability of error of the maximum a posteriori (MAP) decision rule is P e (k) 
and is given by 

P e (k) = / min (6p p(j*/F), 6 N p( p^/N) ) dpk (A-3-27) 

pk 
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The multi-dimensional integral in Eq. A-3-27 is extremely difficult to eval- 
uate, even with numerical integration. For this reason, the J-divergence and 
the Bhattacharyya distance are used in signal selection problems of commun- 
ication theory and in feature selection problems of pattern recognition, 
because they provide the following bounds on P e (k) [32-34].* 

0.5 - 0.5 (1 - 4 6 n S f P fn OO )°* 5 < P e (k) < (^dp) 0 * 5 p FN (k) (A-3-28) 

0.5 min (<5 n 6 f ) exp (-J FN <k)/8) < P £ (k) < (d^) 0 * 5 (j pN (k)/4 )~°' 25 

(A-3-29) 

with upper bound in Eq. A-3-29 valid only for Gaussian statistics. In Eqs. 
A-3-28 and A-3-29, p FN (k) is the Bhattacharyya coefficient and J pN (k) is the 

k 

J-divergence based on p : 

P <k) - / (p(y k /F)*p(pk/N)) 0 ’ 5 dp k (A-3-30) 

FN pk. 

J FN = / (p(p k /F) - p(y k /N)) Sn dy k . (A-3-31) 

11 p( U k /N) 

When the signal to noise ratio is high (i.e., as P e ->0), it is stated in Ref. 
[32] that 

in P (k) » In p(V = - B (k) as P (k)-K) (A-3-32) 

e FN FN e 

where Bpjj(k) is the Bhattacharyya distance between the probability 
distributions p(y k /F) and p(y k /N). 

*Note that, in the subsection, we have suppressed the dependence of p, J and I 
on the order of parity relations, p and the number of parity relations, t for 
convenience of notation. 
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A-3 .2.2 Asymptotic per Sample Distance Measures 

As the number of samples of parity checks, k+», the asymptotic per sample 
I-divergence is 

i 

X FN = lim " I FN (k) (A- 3-33) 

k -*■<*> k 


where 


I pN ( k ) = / p(p /F) An 


p( P k /F) 

p( u k /N) 


dp 


(A-3-34) 


It is shown in Ref. [33] that Ip N is given by 

ipN = (4k)_1 f ^ tr f%N (a)) V (w) ' ~ in det (%N (w) V (UJ) ) 


T _ x 

+ (y F (ai) - p N (uO) * yN (m) (ppCm) - p^w))} d 


0) 


(A-3-35) 


where $ (u>) and $ (u) are the spectral density matrices of residuals under 

pF pN 

the failed and normal hypotheses, respectively. In addition, p ? ( u>) are the 
discrete Fourier transforms of p p (k) and p^Ck) , respectively. The asymptotic 
per sample J-divergence is evaluated as J pN = 1^,^ + I^p,. Similarly, the 
asymptotic per sample Bhattacharyya distance is given by 




2 TT 


-0.5, 


.0.5, 


FN 


pF 


pN 


+ 4°; 5 (03) r° ,5 (ui)/2)] + - (HpU) -HjjCw)) 1 [($ yF (m) (A- 3-36 ) 


%N (u)/2 ^ 1 ( W F (u)) ~ dUi 
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Interestingly, Eqs. A-3-35 and A-3-36 are valid even if the deterministic mean 
components pjj and up are functions of frequency, w. This generality allows 
us to consider a variety of bias, ramp and other periodic failure modes. 
However, asymptotic measures are of limited utility in FDI, where one is 
interested in the transient effects of failure modes. 

A-3.2.3 Relation to Fisher Information Matrix (FIM) : 

In parameter estimation problems (i.e., a continuum of hypotheses), it 
turns out that the I-and J-divergence measures , as well as the Bhattacharyya 
coefficient, p (or equivalently, the Bhattacharyya distance, B) are related to 
the Fisher information matrix (FIM). The use of FIM for optimal input design 
for system identification, and for the design of insensitive feedback control 
laws is well known [42], [A3]. This relationship provides additional motiva- 
tion for the use of divergence measures and the Bhattacharyya distance 
measures for parity generation. To show this relationship, consider the 
Bhattacharyya coefficient between p( p k / 0) and P( y k / EH-A0) , denoted by pe,(H-A0 
is: 

P = / (p(y k /e)*p(u k /e+-A8)) 0 ’ 5 d y k (A-3-37) 

0,(H-A0 y k 

The Fisher information matrix, Fg, on the other hand, is given by 

3 3 T 

F© = / f — £n p( p k / 0) ) ( — in p( u k / 9) ) p( w k / 0) dp k (A-3-38a) 
y k a0 90 

= / ( — p(p k /0)) ( — p( y k / 0) ) p(p k /0) dy k (A-3-38b) 

u k 8 0 80 

To exhibit the relationship between Eqs. A-3-37 and A- 3-38, we expand 
p( p k | 0+A0) to second order using Taylor series: 
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X 3 1 x 3 2 

p(y k /W-A0) = p(y k /9) + AS • — p(y k /0) + - A6 — r p(y k /9) + ofllASI 3 ) (A-3-39) 

36 2 36 z 

g2 

where p(y k /0) is the Hessian of p( y k / 6) with respect to 9 and D. 0 is the 

30 2 

Euclidean norm. Using Eq. A-3-39, we can evaluate P0 t efA0 to second order as 


, r , 1 T 3 , T 3 2 

P0,0+A0 - / [p(p k /6) + - A0 — p( y k / 0) + A0 p(y k /0)A0 


yk. 


30 2 


1 X ^ r r | 

A0 ( — p( y k / 6) ) ( — p(y k /0) (p(y k /0>) A0] dy k (A-3-40) 

8 3 0 30 


Note that / p(y k /0)dy k = 1, and also the second and third terms integrate to 


zero, and Eq. A-3-40 reduces to 


1 T 


P0.&+A0 “ 1 40 Fq A0 

8 


(A-3-41) 


Thus, in a local sense, we have the following: small Bhattacharyya 

coefficient, p implies larger Bhattacharyya distance, which in turn, implies 
larger information content in y k to distinguish (or identify) 0. 

Similarly, one can show that the I-divergence , for small parameter 
variations, is given by 

1 T 

I 0,0fA0 = “ 40 Fe A6 (A-3-42) 


and the J-divergence is given by 

T 

J 0, 0+A0 = 40 Fe A9 (A-3-43) 

The relations (Eqs. A-3-41 - A-3-43) imply that 1-p, I and J satisfy metric 
properties of topology locally (but not globally). They also imply that if 
the two hypotheses F and N differ by a small amount, the Fisher information 
matrix provides a convenient measure of distinguishability. 
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A-3.2.4 Distinguishability of Hypotheses (Failure Modes ) 

In the zero mean case, i.e, yp(<u) = yji( w) = 0, in Eq. A-3-36, it is shown 
in Ref. [34] that the Bhattacharyya coefficient ppfl(k)->0 (and, consequently, 
the probability of error, P e (k)+0) as long as the asymptotic per sample 
Bhattacharyya distance Bpjj of Eq. A-3-36 is nonzero. A necessary and suf- 
ficient condition for Bpjj = 0 is that the eigenvalues Xi, i = 1,2 t of the 

generalized eigenvalue equation: 

V (u,) g i = X i *uN (t0) g i ; i=1 » 2 »***> t (A- 3-44) 

satisfy 

Xi(uj) = 1 almost everywhere on m e [ 0 , 2 tt] . (A-3-45) 

Equation A-3-44 provides a simple check on the "discriminability” (or distin- 
guishability) of two hypotheses for a given residual generation mechanism. 
Looked another way, it also tells us that as long as the Bhattacharyya dis- 
tance, Bpfj is nonrero, one can devise parity relations that achieve discrim- 
ination between two hypotheses. However, this analysis does not address the 
issue of speed of such discrimination, which is discussed next. 

A-3 .2.5 Speed of Discrimination 

It is also shown in Ref. [34] that, for stationary dynamic models, the 
Bhattacharyya coefficient PFN(k) (and hence P e (K)) tends to zero exponentially 
with the number of observations, k, and that the rate of covergence is 
governed by the asymptotic per sample Bhattacharyya distance, Bpjj, as 

exp {-a(k) - k Bpfj} < PFN(^) < ex P {<*(k) - k Bp^} (A-3-46) 
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where a(k) is any function that satisfies lim k~l a(k) -*• 0 and Bpjj was defined 

k-*-°° 

earlier in Eq. A- 3-36. Since ppfj(k) provides an upper bound on the probabil- 
ity of error (see Eq. A-3-38), we can compute a lower bound on the detection 
delay, k d . For a desired probability of error, P e des 

k d £n f P edes * (Wr°* 5 ]/i F N * (A-3-47) 

In addition, Eq. A-3-46 says that a parity generation mechanism that has lar- 
ger asymptotic per sample Bhattacharyya distance Bpjj is likely to be an 
exponen- tlally faster discriminator of failed and normal modes than one with 
a smaller distance. Second, when a decision statistic is being designed, the 
bounds in Eqs . A-3-28 and A-3-29 provide a means to select the required data 
collection window for a given probability of error. Specifically, if we view 
the data collection process as part of the parity check generation process, 
i.e., we are generating parity checks of relatively high-order p, then p 
should chosen so that 

P FN (p ’ t} “ (Vf)' 0 * 5 * P edes (A-3-48) 

where we have shown the explicit dependence of the Bhattacharyya coefficient, 
p on the order and number of parity relations p and t. 

A-3.2.6 Divergence and the Bhattacharyya Distance in Uncertain Models 

When the model parameters are uncertain as in Eqs. A-3-1 through A-3-4, 
we define the I-divergence , lFN(P»t)* as the average discriminating infor- 
mation for failure hypothesis F with respect to hypothesis N over all possible 
models, l = 1,2, That is, 

W* 0 ■ * P * E « lV P ’ t)/ *l (A-3-49) 

X> X 
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where lFN/£(P» c ) * s tbe I-divergence between hypotheses F and N conditioned on 
model i. Letting Ep^ and be the steady-state covariances of p(k) under 
hypotheses F and N, under model fc, and pp^ and be the corresponding means, 
we can compute iFN/fcCptt) from Eq. A-3-17 as: 

I FN/£^ P,t) = ~ [' £n det C E p £ ) + tr (^pj, “ I t^ 

+ ( P F£ ~ P NjJ hi (**F£ ” W N£^ (A-3-50) 

Similarly, the J-divergence is given by 

A L 

J FN (p ’ t) = ^ P £ E £ ^FN/^’^ (A-3-51) 

where Jfn/jiCp* 11 ) is obtained from Eq. A-3-18 as 

J FN/£ (P,t) = 2 tr ^ Z F£ ~ E N£^ ^F£ ~ E N£^ 

+ 2 ~ ^ Z F£ + E N£^ ( P F£ ~ (A-3-52) 


In a similar vein, the Bhattacharyya coefficient PFN(p,t) is given by 


P FN (p,t) ^ P £ P FN/£ (p,t) 


(A-3-53) 


where PpN/^P.t) is S iven l 22 l 

p FN/» (p * t) ■ det t E ?I 5 * 5 + E « 5 e m / 2 ! ' exp <- ; (>Vt - “m) t 




(A-3-54) 
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A closed form expression for Bpii(p,t) = - £n PFN(P»t) is not possible, since 
ppN(p>t) in Eq. A- 3-53 is a sum of exponential terms. However, since - in x 
is a convex function of x, we can use Jensen's inequality [22] to obtain an 
upper bound on Bp{,j(p,t): 

L 

Bp^p.t) < Z P £ B FN/£ (p,t) (A-3-55) 

1 

where Bp^/^(p,t) is obtained from Eq. A-3-19 as 

D , x 1 . , r 0.5 _-0.5 , -0.5 ,0.5.,! 

B FN/£ (p,t) ~ 2 Zn det t Z F£ ^N£ + E F £ E N£ /2 J 

+ e ^F£ ~ W N£^ ( E F£ + Z N£ ^ 2 ) ( U F£ ~ y N£^ (A-3-56) 

If we make the Gaussian sum approximation in Eqs. A-3-49 - A-3-56, then 
the I-divergence lFN(p,t), the J-divergence JfnCp* 11 ) and the Bhattacharyya 
distance Bp(j(p,t) (approximately) reduce to Eqs. A-3-17 -A-3-19, respectively, 
with the following identifications* 

L 

L p = l ? i e F £ (A-3-57a) 

L 

Z N = P £ E N£ (A-3-57b) 

L 

U F = Jj P £ ^F£ (A-3-57C) 

L 

P N = l P £ ^N£ (A-3-57d) 

£=1 


''The approximation stems from the neglect of mean related term in the 


covariance of the Gaussian sum approximation, i.e., Eqs. A-3-57a and A-3-57b. 
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A-3.2.7 Extension to Multiple Hypotheses 


If there are more than two hypotheses H e H (e.g., Hq = N, = F, H 2 = 

F' etc.), then it appears that the following average J-divergence and the 
Bhattacharyya coefficients are suitable measures: 

J(P»t) ~ 1 I <$H <$H' JHH'(P>t) (A- 3-58) 

HeH H' eH 

p(p»t) = 1 £ 6 h <Sh* PHH'(P>t) • (A-3-59) 

HeH H' eH 

In the multiple hypothesis case, it is known that the total error probability 
P e tot and the pair-wise error between hypotheses H and H' , denoted by PeHH' » 
given by Eq. A-3-27 are related by the inequality [44] 

**etot * “ 1 1 p eHH' • (A-3-60) 

2 HeH H' eH 

Since the Bhattacharyya coefficient phh’ provides an upper bound on the 
probability of error (see Eq. A-3-28), we have 

p etot < ~ I l S H 6 H' PHH 1 • (A-3-61) 

2 HeH H' eH 

1 

In addition, since 6 h^H' * ~ , we have 


p etot < 7 l l PHH’ • (A-3-62) 

4 HeH H' eH 

The average measures have limited practical utility in our decentralized FDI 
approach. We use pairwise measures to generate the multilevel parity 
generation scheme discussed in Section A-l and subsection A-3.1. 
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A-3.3 PARITY GENERATION TO MAXIMIZE THE DIVERGENCE MEASURES AND BHATTACHARYYA 
DISTANCE 

In this subsection, we use the divergence and the Bhattacharyya distance 
based measures to generate parity relations for the system described by 
Eqs. A-3-1 through A- 3-4 to detect a failure. The key idea here is to gener_ 
ate the parity transformation matrix G that maximizes separation between 
normal and failure modes of system operation, i.e., to find parity checks that' 
are representative of the dissimilarities between normal and failure modes. 

Using the Gaussian sum approximation embodied in Eq. A-2-60; letting C N 
and Cp denote the average ACF counterparts of Eq. A-2-60, and Yp P denote the 
mean of Y^ under failure hypothesis (note that Y^ N = 0 by assumption) , we 
have 

^ = gT C F G (A-3-63a) 

Efj = G Gp G (A-3-63b) 

T - 

“ G Y p p (A-3-63c) 

^N = gT V = 0 (A-3-63d) 

Assuming, without loss of generality, that the failure occurred at k-p, Ypp is 
given by 

V = r p d pF (k) + b pF (k) (A-3-64) 

where T can be evaluated directly from the system matrices {C^,A^} and prior 
probabilities, as: 
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(A-3-65a) 


r = l P £ r p£ 
£= 1 


p£ 


0 

°F£ 

C F£ A F£ 


C F £ A ?~£ 1 


0 

0 


.. o 
.. o 


C F£° — 


0 


Cp£ 0 


(A-3 


dp F (k) “ [d(lc— p) d(k-p+l) 
bp F (k) = [b(k-p) b(k-p+l) 


d(k) ] a (p+l)n vector 
b(k) ] a (p+1 )m vector 


We can explicitly evaluate the divergence measures and the Bhattacharyya 
distance (Eqs. A-3-17 and A-3-19) in terms of G, C N » C p and Y pF as: 




T? r~ T» „v-l 


; tr [(g t c f g) (g t c n g) _1 ] 


(A- 


j FN <p» t) = ; g [cc i vr + (c'vr 1 ]g^ 




PF 


pF 


+ ; tr [(gV)- 1 ^) + (g t c f g) 1 (g t c n g) - 21 J (A _ 


-65b) 


3-66) 


3-67) 
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1 r , T"> .-0.5 T~ .°. 5 

B FN (p » t) = 2 * n det t(G C N G ^ ( G C F G ) 

, T~ T~ .0* 5 i 

+ (G C f G) (G C n G) !2] 


1 -T r ~ . , 1-1 T- 

+ - V° [G ( c n + c p) g/2 ! G V 


(A-3-68) 


The general optimization approach for all distance measures involves a 
gradient-type scheme. However, in two special cases, which are of consider- 
able interest in the FDI problem, the optimization provides explicit solu- 
tions. The general problem can also be solved suboptimally. We will discuss 
these next. 

A-3.3.1 Detection of Bias Failures of Known Magnitude 

In this case A^ = Ap^, ■ Dp^ for all l . This implies that the 
average ACFs C p = C N = C , and the distance measures (Eqs. A-3-66 - A-3-68) 
are scaled versions of the Mahalanobis distance [17]: 


- i w p *° ■ 4 B FN (I> • t, ■ v T G ( GTec r 1 c T > 


(A-3-69) 


The optimal parity relation to maximize the criterion in Eq. A-3-69 is given 
by the following theorem. 


*The approximation Cf=Cn comes from neglecting the mean related term in the 
covariance of the Gaussian sum approximation. 
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Theorem 3.1: There exists a single best parity relation (i.e., t=l) that 


maximizes Eq. A-3-69 and is given by 


u,00 - g[ yk> - c -1 Y p OO 


The parity transformation G* is 


* ~-l - 

G " *1 " C Y pF 


and the optimal distance measure is 


I FN^ P,t ^ 2 J FN^ P,t ^ “ ^ B FN^ P,t ^ 


1 -T -V~l - 

2 Y pF C Y pF 


Proof : The I-divergence measure IpN(p»t) in Eq. A-3-69 can be 

trace functional 


V P>t) = ~ tr ^ G G )' 1 gT > Y pF G ] * 


The optimal G* satisfies the gradient relationship: 

_ — t * , *t ~ * A -l 

0 ’ Vfn ■ V V 0 ( G C G ) 


^ , *T ~ I *T — — t * *T ^ * — 1 

- C G (G C G ) G Y pp Y pp G (G C G ) =0 


~-l *T 

Pre-multiplying and post-multiplying Eq. A-3-74 by C and G 
the simplified relationship: 


1 — — T * * *T ^ * — 1 *T — — T * 

C Y _ Y G = G (G CGl G Y Y G 
pF pF ^ J pF pF 


(A-3-70) 


(A-3-71) 


(A-3-72) 
written as a 


(A-3-79) 


• (A-3-74) 

~ * 

C G , we have 


(A-3-75) 
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it is well known [22] the the product (G C G ) * G Y pF Y pF G can be 
diagonalized by an orthonormal matrix ¥ such that 


*T * —i *T — — t * 

(G CG) G Y pp Y pp G ¥ = ? A fc 


(A-3-76) 


Using Eq. A-3-76 in Eq. A-3-75, we have 


1 — — T * * 

c v V G ’ - G ’ A t 


(A-3-77) 


* 

Since a t by t orthonormal matrix does not change I F ^, we can include Y in G . 
Since Eq. A-3-77 is an eigenvalue equation, we conclude that the optimum parity 

k 

transformation G is achieved by selecting the t eigenvectors corresponding to 

1 - -x ~~1 - -T 

the largest eigenvalues of C Y pF Y pF . However, the rank of C Y pF Y pF is 

one, therefore, we obtain a single best parity relation (i.e., t=l). Indeed, 
the largest eigenvalue is 

-T „-l - 

\nax = Y pF ^ Y pF • (A-3-78) 


Equation A-3-78 is obtained by noting the fact that rank of C Y pF Y pF is one 
(i.e., it has a single non-zero eigenvalue), and that the trace of a matrix is 
the sum of its eigenvalues: 


,~-l - -T > -T - 

« ( C Y pF >) ' Y nF ° 1 


pF 


pF 


(A- 3-79) 


Using Eq. A-3-78 in the eigenvalue equation 


~-l 

C 





Y pF S 1 


(A-3-80) 
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Clearly, the optimal g\ is given by Eq. A-3-71. The optimal distance measure 
in Eq. A-3-72 is obtained by substituting Eq. A-3-71 in Eq. A-3-69. 

The parity relation p^(k) can be thought of as an approximate whitener 
followed by a correlator, as shown in Fig. A-3-1. The approximation stems from 
the Gaussian sum approximation in Eq. A-3-60. The parity relation (Eq. A-3-70) 
is precisely the decision statistic obtained when we consider the problem of 

detecting known signals with (p+1) observations in a zero-mean colored noise 
with autocovariance function C and signal mean . To show this relation- 
ship, consider an observation model: 


z(r) = s q (r) + n(r) ; q = 1,2,0 < r < p (A-3-81) 

where n(r) is a zero mean Gaussian noise process. The autocorrelation func- 
tion of (n(0) • • *n(p) ) is 2 . We let 


= 0 (A-3-82) 

S 2 (r) = (k ~ p+r) (A-3-83) 


where y (k-p+r) is the average output over all models SL= 1,2, under fail- 

r 

ure hypothesis. Then, the optimum likelihood test is to decide s^ if and only 
if: 





Z 

P 


> 


1 

n + - 

2 



~-l 

C 



T -v 

s\ c s J 

p2 - p2 J 


(A-3-84) 


where n is the threshold that is a function of prior probabilities, and costs 
associated with false alarms and missed detections. Spj , Sp2 and Zp are 
extended observations of length p, defined by 


A-81 



Figure A-3-1. Whitener-Correlator Interpretation of Detection Parity 
Relation 

z p = [z(0) z(l) ••• z(p>] = Yj(k) (A-3-85a) 

X A 

Spi = [o 0 ••• o] ( A-3-85b) 

S p2 " (k-p+r) y F (k) ] = V • (A-3-85C) 

The left hand side of Eq. A-3-84 is precisely the information collection 
phase that we alluded to in Section 1. Thus ui(k) obtained from the diver- 
gence And the Bhattacharyya distance based measures is already accomplishing 
the parity generation and information collection phases of the FDI process. 

A-3.3.2 Detection of Noise Variance Changes and Scale Factor Failures 

This case corresponds to the situation where the noise processes have 
different covariances under all failure modes and uncertain models, but the 
additive disturbances, dp(k), and the bias term, bp(k) , are zero. As a re- 
sult, Ypp(k) = 0 and * C^,. With this simplification > maximization of the 

J-divergence and the Bhattacharyya distance yield identical optimal parity 
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transformations G, while the transformation that maximizes the I-divergence is 
different. The optimal G is given by the following theorem. 


Theorem 3.2 : The J-divergence and the Bhattacharyya distance are maximized if 

1 ^ 

the columns of G are chosen to be the t eigenvectors of C^, corresponding to 
the eigenvalues X^ for which 

X r +A r 1 > X q +X q 1 2 I (P +1 > m ; r=l,2,*»«,t (A-3-86) 

while the I~divergence is maximized if the columns of G are chosen to be the t 

- 1 IV 

normalized eigenvectors associated with the t largest eigenvalues of C C . 

N F 

The optimal distance measures for the selected parity relations are: 



(P.t) 



(A-3-87a) 



(P.t) 


1 c , 0.5 -0.5 

- Z £n f A + X / 2 

2 r=l r r 


(A-3-87b) 



(P.t) 



Jin X - 1 ) . 

r J 


Proof: We will consider each measure, in turn. 


(A-3-87c) 


a. Maximization of J-Divergence 

The gradient expression for the J-divergence of Eq. 
given by 


A- 3-6 7 when Y =0 is 
pF 


V G J FN (p>t) " ’ S N G ^ e N G ^ _1 ^ G F G ) ^ G N G ) _1 

- C p G (G T C p G) _1 (G T C n G) (G T e p G)' 1 

+ C p G (G T C n G) -1 + C N G (G T C p G)" 1 . ( A-3-88) 
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/v - *1 •jy 

Pre-multiplying Eq. A- 3-88 by C N , post-multiplying by (G G), and setting 
the gradient to zero, the optimum G* satisfies the relationship 



* * , *T — 1 

C F G = G (G C N G ) 



(A-3-89) 


, *T ^ *._i 

As before, (G G ) 

transformation if; as: 



c f g 


) 


can be diagonalized via an orthonormal 


*T * _i *T * 

(G C N G ) (G C p G)f S U t , (A-3-90) 

Using Eq. A-3-90 in Eq. A-3-89, we have the eigenvalue relationship: 

C N C F G!=GTA t . 

Since any orthonormal transformation does not change the J-divergence , we can 
include V into G*. Therefore, optimum G* should consist of the eigenvectors 
of C p . Also, note that the eigenvalues of (G C n G ) * (G G ) in 

1 fV. 

the t-dimensional subspace are the same as the t eigenvalues of C N in the 
original subspace. Then J FN (p,t) becomes 

\ + X r‘ - 2 ) ' 

z r=l 


Therefore, in order to maximize J FN (p,t), we choose G to be the t normalized 
1 

eigenvectors of C N C F corresponding to the eigenvalues for which 
A r + A r 1 > X + X q 1 ; q=l , 2 , • • • ,(p+l)m ; r=l,2,*»*,t . 


A-84 



b. Maximization of Bhattacharyya Distance 


The gradient expression for the Bhattacharyya distance of Eq. A-3-68, when 
Yp F =0 (to within a scale factor), is given by 


V G B FN (p,t) = [ £ N G S G ^' 1 ' G F G ^ G F G ^ 1 ] 


[(G T C n G)’*"(G A C f G) (G 1 C N G * ( G a C„ G)"~] 




.-0.5 


,T - 


_0. 5 T - 


0.5. 


(A-3-91) 

Optimal G* satisfies either 



~ * * , *T * —i *T * 

C F G = G (G C N G ) (G C p G ) 


(A-3-92) 


or 

G ^ C N " C F^ G = ° * (A-3-93) 

"fc 

The term G of Eq. A-3-93 cannot be optimum, because this makes G =0 and B FN =0 

/v <v 

(unless C N =C F> in which case it is indeterminate and a meaningless problem). 

* 

Since Eq. A-3-92 is the same as Eq. A-3-89, G should consist of normalized 
1 1 

eigenvectors of C N C F . In terms of eigenvalues of C N C F , the Bhattacharyya 
distance is given by 


V Pit) ■ - 


t 

z 

2 r=l 


0.5 _0.5 

StXi [X r + A r /2 J 


Since £n x is a monotonic function of x, Bp^(p,t) is maximized if we select G 

1 ^ 

to be the t eigenvectors of C N C F corresponding to the eigenvalues A^ for 
which 


0.5 —0 .5 0.5 —0.5 

A + A > A + A ; q=l,2, •••,(p+l)m ; r=l,2,»*»,t 

r r q q 

— i 0.5 —0. 5 o 

Noting that A f + A f = (A r + A r ) -2, we immediately see that the Bhattacharyya 
distance and the J-divergence yield identical parity transformations. 
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c 


Maximization of I-Divergence 


The gradient expression for the I-divergence of Eq. A-3-66 when = 0 is 


given by: 


VFN (p » t) = C N G C F G ) - C F G ( G C F G ) ( G C N G ) * 


. T **■ x-1 ~ , T »*■ \-l 

(G C p G) + C„ G (G C F G) 


- c N g (g t c n g) -1 (g t c f g)(g t c n G) 1 


(A-3-94) 


At the optimum G* , we have 


/s. - 1 ~ * * *T * —i *T ^ * 

C N C F G " G ( G S C ) (C C p G ) . 


(A-3-95) 


Using the same argument as before G is chosen from the eigenvectors of C^. 

1 /V 

The value of the I-divergence in terms of the eigenvalues of is given by 


yp.t) = i (\ - to x r - 1) 

r=l 


Since X^ > An X^ for all X^ > 0, I F ^(p,t) is maximized by selecting G to be 

1 /x 

the t normalized eigenvectors of C M C corresponding to the t largest 

N r 

eigenvalues . 


RELATIONSHIP WITH OTHER MEASURES 

For the scale-factor failures, a variety of criteria similar to the 
I-divergence have been used in the literature. We will discuss four such cri- 
teria here. First in [13] a robust redundancy metric is used for detection, 
Jj, that involves maximization of the difference in the covariance of parity 
relations under failed and normal hypotheses: 
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J^P.O = tr [G T (C F -C N ) G] 


(A-3-96a) 


subject to 

T 

G G = I 


(A-3-96b) 


/ \ 

Thus, if we interpret tr[G C^GJ as a measure of "failure signature" strength 
under failed hypothesis and tr(G C^G) as a measure of noise strength, the 
criterion in [13] corresponds to maximizing the difference in energies between 
the failed and normal modes of system operation. In this case, the optimal 
parity transformation G is given by the t eigenvectors of [Cp-C^J corre- 
sponding to the largest t eigenvalues of (Cp-C N ) . 

A second criterion is related to maximizing the difference in entropy of 
parity relations under failed and normal system operation: 


l 

J 2 “ H F^ p,t ^ ~ 1 VP» t > " 2 det 



(A-3-97) 


This criterion has precisely the same optimal parity transformation G* as that 
obtained by maximizing the I-divergence. This criterion is used extensively, 
albeit in an ad hoc manner, in feature selection problems [22]. 

A third criterion is related to maximizing the signal strength subject to 
a constraint on the noise strength in the parity relations. The objective 
function is to maximize 

J 3 = tr (G T C p G) (A-3-98a) 

subject to 

tr (G T C n G) = i . (A-3-98b) 
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Here, tr(G^SpG) is a measure of the detection capability of the parity rela- 

( -v 

tions and tr[G C^GJ is a measure of the false alarm probability. It can be 
shown that the optimal parity transformation G is the same (except for a scale 
factor) as that obtained by maximizing the I-divergence. 

Finally, continuing on our signal strength and noise strength interprets- 

( T' n * f T~ \ 

tion of tr[G C^G) and tr(G C N GJ, respectively, a natural criterion is to 
choose G to maximize the signal to noise ratio: 

tr (G T C f G) 

j 4 = • . (A-3-99) 

tr (G T C N G) 

This is a very difficult measure to optimize. However, it is stated in [22] 
that the eigenvector solution obtained for the I-divergence criterion provides 
near-optimal solutions for this problem as well. 

A FREQUENCY DOMAIN ALGORITHM 

As is the case with the robust parity checks of Section A-2, the AR fil- 
ter interpretation of parity relations and the asymptotic per sample distance 
measures of Eqs. A-3-35 and A-3-36 can be used to devise a frequency domain 
algorithm for the parity generation, whenever the normal and failure modes 
result in stationary models. We illustrate the frequency domain algorithm for 
the I-divergence measure. Similar algorithms are obtained for the J-diver- 
gence and the Bhattacharyya distance measures. 

The asymptotic per sample I-divergence in the equal mean case, i.e., when 

Y „(k) = Y „(k) = 0, is obtained from Eq. A-3-35 as 
pF pN 
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FN 


= (4u) • / {tr [g T ( j u>) * N ( to) G*(ju))) -1 (G T (ju)) * p ( cu) G*( Juj) ) ] 

0 y y 

- in det [ (G( j oj) $ yN ( w) G (jw)) 1 (G T (joi) 4> yF (cu) G (jw))]} (A-3-100) 


An approximate algorithm for obtaining a p-th order parity relation is as 
follows : 

a. Divide the interval [0,2tt] into p equal intervals and let 


2 TT 

Wf = r — ; r = 0,1,2, ***,p . 

P 


b. As in Section A-2, evaluate 4>ytqC w) and (fypC w) at wr and the t 

orthogonal eigenvectors, g± Tt 1 < i < t of dimension m corresponding to the 

-1 

t largest eigenvalues of <&yN((*>r) ♦yF( u r)* l«t 


G = g, ,g„ »’’*,g an m by t matrix, 
r L6 lr 6 2r ,6 tr J 


(A-3-101) 


c. Normalize each column of g^ r so that 


P 

l 


q=0 



’ir 


1 


where g^ r is the i-th column of G r . The resulting normalized g£ r vectors pro- 
vide the (p+l)m by t parity transformation matrix G. 

As in Section A-2, the result can be expanded to any order p with a 
linear growth in computation time. 
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A-3.3.3 General Failure Modes 


The general case where a failure manifests as a bias and scale factor 
change requires a gradient scheme for optimization. However, suboptimum 
procedures can be generated following the approach proposed for feature 
selection problems [22]. 


MAXIMIZATION OF J-DIVERGENCE 

In the general case, when means and covariances are unequal, we could use 
the following suboptimum procedure to maximize the J-divergence (Eq. A-3-67). 

1 ^ 

a. Form the eigenvectors of C N Cp. Let F be the set of eigenvectors 
and d = F . Let q-th component of d be d^. It is well known that F 

/K. />. 

can be selected such that it simultaneously diagonalizes C N and [22] as 


T T 

F C M F 1 = I F C F 1 = A 
N f 


(A-3-102) 


b. Choose the columns of parity transformation G to be those t columns 
of F for which 

(l + X -1 ) d 2 + A + A -1 > fl + A _1 ) d 2 + A + A _1 for all 
v r-'r r r v q ' q q q 

q = 1,2, •••,(p+l)m , r = 1,2, •••,t . (A-3-103) 


The divergence for t selected parity relations becomes 

t -19 -1 

J FN (p,t) = ^ U 1 + A r ) d r + A r + x r “ 2 } . (A-3-104) 

This approximation works well when the divergence term related to the mean can 
be expressed by a small number of t eigenvectors chosen according to Eq. 
A-3-104. Alternatively, when the mean related term is dominant, we can select 
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the first parity relation according to Eq. 3-70 with C = [6 N + C p ]/2, and the 

^-1 ~ 

remaining (t-1) parity relations from the eigenvectors of C p corresponding 
to the eigenvalues satisfying Eq. A-3-86. However, we lose the orthogonal 
property of the columns of G. 


MAXIMIZATION OF THE BHATTACHARYYA DISTANCE 

In the general case, when means and covarances of Yp are unqual under the 
two hypotheses, a suboptimum procedure for choosing parity relations to 
maximize the Bhattacharyya distance (Eq. A-3-68) is as follows: 

~-l ~ 

a. Form the eigenvector matrix F of C p such that 

~ T -v t 

FC n F = I ; FC p F - A 


Let 



b. The columns of G are selected to be the columns of F for which 


l _12 , °* 5 -0.5 i _1 0.5 -0.5 

- fl+X 1 d + to (X + X ) > - fl+X ) d + to (X + X 
2 '-rJr '-r r ' 2 ^ ; 9 ^ q <1 

for all q = 1 ,2 , • • • ,(p+l)m ; r = 1,2, •••,t . (A-3-105) 

The Bhattacharyya distance for the t selected parity relations becomes 

i 

1 2 

As with the divergence, this approximation works well when the term related to 
the mean can be expressed by a small number of t-eigenvectors chosen according 
to Eq. A-3-105. When the mean vector is dominant, we select the first parity 


0\J 


-1 j2 


d" + to f X 
r v r 


0.5 - 0.5 


+ X 


! 2 ) 


(A-3-106) 


B FN (p » t) = 2 1 

c r= 
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relation according to Eq. 3-70 with C = [6^ + 6p]/2, and the remaining (t-1) 

1 ^ 

parity relations as the eigenvectors of Cp corresponding to the eigenvalues 
satisfying Eq. A-3-86. However, the columns of G are not orthogonal in the 
latter case. 


MAXIMIZATION OF THE I-DIVERGENCE 

In the general case, the maximization of the I-divergence in Eq. A-3-66 
follows the procedure outlined below. 

a. Form the eigenvector matrix such that 


m m 

FC N F =1 ; F Cp F = A 


Let 


d = F Y 


pF ‘ 


b. Choose the columns of G as those columns of F for which 


d 2 + X - In X > d 2 + \ - to A for all 

r r r q q q 


q = 1,2, •••,(p+l)m ; r = 1,2, •••,t . 


(A-3-107) 


The I-divergence for the t selected parity relations is 

i t 

I FN (p,t) = 2 j-j ^r + X r + ta \ ~ ^ * (A-3-108) 

As with the divergence and Bhattacharyya distance this approximation works 
well when the term related to the mean can be expressed by a small number of 
eigenvectors chosen according to Eq. A-3-107. Alternatively, when the mean 
related term is dominant, we select the first parity relation according to 
Eq. A-3-70 with C = [C^ + Cp]/2 and the remaining (t-1) parity relations as 

the eigenvectors of C N C p corresponding to the largest (t-1) eigenvalues. 
However, the columns of G are not orthogonal in the latter case. 
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It should be noted that the single failure results presented in this section 
are also applicable to multiple failure modes, if we treat all the failure 
modes besides the one we wish to isolate as uncertain normal-mode behavior. 

A-3.4 EVALUATION OF ALTERNATIVE FDI SCHEMES 

As discussed in Section A-l, a measure of effectiveness of an FDI is its 
ability to focus information among its residuals. That is, the effectiveness 
of an FDI system is related to its ability to "collect" relevant failure in- 
formation quickly and accurately. Our information distance measures based on 
the I-divergence, the J-divergence , and the Bhattacharyya distance provide 
convenient tools to evaluate the effectiveness of alternative FDI schemes. 

In this section, we evaluate two different FDI schemes: one based on Kalman 
filter (KF) generated residuals and the other based on the parity space 
approach. 

Consider a Kalman filter based on nominal system parameters (A q, Cq, Dq, 
QO, Ro)« Let the innovation representation of the system be given by 

x(k+l) » A Qk x(k) + K f (k) v(k) (A-3-109) 

a 

v(k) = -C Q x(k) + y(k) (A-3-110) 

where 

K f (k) = A Q Z 0 (k|k-l) Cj (C 0 Z 0 (k|k-l) Cl + R 0 ) _1 (A- 3-111) 
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Z Q (k+l|k) = (AQ-KfOc) C Q ) Z 0 (k|k-1) (AQ-K f (k) Cq) 1 


+ DqQqDq + K f (k) R q K f (k) 

(A-3-112) 

V - *0 * K f (k) C 0 

(A-3-113) 

v(k) = Filter residuals 

(A- 3-114) 

a 

x(k) = Predicted state estimate 

(A-3-115) 


Suppose, we have an innovation sequence from the Kalman filter: 

vN = [v(l), v(2) , ••*, v(N) ] (A- 3-116) 

where we have assumed, without loss of generality, that the sequence starts at 
time step k=l. Consider a similar sequence of parity relations generated from 
the parity space approach: 

P N = [p(l)» p(2) , •••, u(N)J . (A-3-117) 

Let the corresponding decision statistics (e.g., weighted sum-squared resi- 
duals, whitener-correlator , etc.) for detecting a failure F be Sf(v^) and 
Sp(pN), respectively. Then, the J-divergence and/or the Bhattacharyya distance 
of Sf(v N ) between failed and unfailed modes of system operation, and similar 
distances for Sp(v^) provide measures for evaluating the two FDI schemes, in 
terms of their ability to distinguish the two hypotheses. We will illustrate 
the evaluation procedure for the whitener-correlator and the weighted sum- 
squared residual decision statistics. The evaluation procedure can be 
extended to multiple hypotheses in a straightforward manner. 
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A-3.4.1 Evaluation of Whitener-Correlator Decision Statistic 


Assume that the Kalman filter based and the parity space based residuals 
be accumulated over time via a linear algorithm: 


s f (v N ) = i W?(k) v(k) 
r k=l 


(A-3-1 18a) 


S 

P 


(P N ) 


1 W^(k) y(k) 
k-1 P 


(A-3-1 18b) 


where Wf(k) and Wp(k) are selected to minimize some detection criterion. The 
problem is to evaluate the average distance measures over all uncertain models 
£ - between the conditional distributions p(Sf/F) and p(Sf/N) for 

the Kalman filter based FDI scheme and p(Sp/F) and p(Sp/N) for the parity 
space based FDI scheme. These measures are given by 

I PN > = J. P £ 4n/£ i=f ’ P (A-3-1 19) 

J FN = F i J FN/ i i=f >P (A-3-120) 

" jj ? l B ra/» 1=f ’ P (A-3-121) 

where the superscript i (=f,p) denotes that the evaluation is being made for 
filter based or parity based FDI schemes, respectively. The conditional dis- 
tance measures based on model £ is: 
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[ (i) 
FN/ £ 


2 2 
Omj ft Or 


1 r» r UNi) ^ x 

- [to (- 2 — J + J 


- l 


Fi£ 


Ni £ 


2 , 2 


+ ^Fii ~ * 1 f,P 


(A-3-122) 


J (i) 

FN/£ 


- I [~2~ + ! 2~ ' 2 

°Fi£ a Ni£ 


+ ^Fi£ “ + ^ ’ 1=f,P 


°Ni£ °Fi£ 


(A-3-123) 


i (i) = 
FN/ £ 


l , r^Fito ^* 5 ^°Ni£\ 0 * 5 , i 


Ni£ 


Fi£ 


+ 4 ( S Fi£ ~ ^Ni£^ ^°Fi£ + °Ni£^ ; 1 f,P 


(A-3-124) 


2 2 

where a_ . „ and o, T . „ are the variances of the decision statistic under failed 
ri£ Ni£ 

and normal hypotheses, respectively for the filter based or parity based FDI 
scheme under model £. Similarly, S pi £ and S Ni ^ are the corresponding mean 
values. 


EVALUATION OF MEAN AND VARIANCE OF DECISION STATISTICS 

The mean values of the decision statistics are given by 


N 


S Ff £ = l V k > v F£ (k) 
k=l 


(A-3-125) 
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where 

^F£ (k) = E i { v(k)/F l * (A-3-126) 


The mean v F ^( k ) can be evaluated from the 2n-dimensional augmented system of 
equations (see Eqs. A-3-3, A- 3-4 , A-3-109). 


x(k+l) 


x(k+l) 


__ _ 





Kf ( k ) C F£ A c 



— — 


— — 


“ — 


x(k) 


D F£ 


I 



+ 

w(k) + 



x(k) 


0 


0 


__ __ 





__ 


d p (k) 


K f (k) 


[v FA (k) + b F (k> ; 


(A-3-127) 


V F£ (k) = t C F£ “ C 0 ^ 


x(k) 

x(k) 


+ v p£ (k) + b F ( fc ) 


(A-3-128) 


Denoting x (k) = (x(k) x(k)), Eqs. A-3-127 and A-3-128 can be rewritten as 

cl 


x a (k +D = A a£ (k) x a (k) + D a£ w(k> + B a£ d F (k) 


+ K a£ (k) f V F£ (k) + b F (k) ^ (A-3-129) 


V k) = C a£ X a (k) + V F£ (k) + b F (k) 


(A-3-130) 
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where A . , D . , B , K . and C 0 have obvious definitions from Eqs. A-3-127 

3 J6 3X 3X 3 Jt. 

and A-3-128. Mean "^ F£ (k) and its covariance fip £ (k) can be evaluated from Eqs. 
A-3-129 and A-3-130 as 


V k) " C a£ x a (k) + b p (k) (A-3-131a) 

= C ai S a£ (k) C ai + R p£ (A-3-131b) 

where 

x a (k+l) = A a£ (k) x a (k) + B aJl d p (k) + K a£ (k) b F (k) (A-3-132a) 


E a« (k+1) - A a* (k) 


E a t (k > 


T 

A a*< k > 


D . Q D „ 
ai a SL 


(A-3-132b) 


Similarly, one 
variance of the 


can compute v N£ (k) and for the norma l mode. 

2 

decision statistic is computed from 


The 



l l Wj(k) E £ {(v(k) - v F£ ) (v(j) - v F£ (j)) T /F} W f (j) 
k=1 J =1 (A-3-133a) 


N N 


_T 


-l l »J(k) [E„{v(k) v A (j)/F} - v p£ (k) v p4 (j)] W f (j) . 


k=l j-1 


(A-3-133b) 


When v(k) sequence is uncorrelated, a!l, takes a simple form: 

1 1 x 


2 

a Ff£ 


N 

I 


k=l 


W f (k) %i. W W f (k) 


(A-3-134) 


A-98 



However, v(k) is correlated in uncertain systems and, therefore, we must eval- 
uate the cross-correlation terms of the form (v(k) v^(j)/F). The computa- 
tion of the cross-correlation term proceeds as follows. From Eqs. A-3-129 and 
A-3-130, we have 


C ai r a ( J> C a* > k > J 


Ej |v(k) v T (j)/F) =] Q^Ck) ; k-j 


(A- 3-135) 


C a« £ a (k) »L<J- k > C L : J >k 


where 


k-1 

4 at (k -J> ’ r Y«« (C) 


Similar expressions can be obtained for the normal mode of operation by re- 

_ 2 

placing subscripts F£ by N£ in Eqs. A-3-131 through A-3-135. Once (S Ff 
2 

and ( S N ££> °Nf£) are obtained* it is a simple matter to compute the distance 

measures in Eqs. A-3-119 through A- 3-124. It is straightforward to evaluate 
2 

S„ . and a, „ for the parity space based decision statistic in Eq. A-3-118b 
Fp£ Fp£ r j r . 

The result is 


V ■ J, wT P (k> eT W k) 


(A-3-136) 


2 

°Fp£ 


N 


I 

k=l 


N 


l 

j = l 


wjoo g t [E t {Y p (k) Y p (3) |f] - f pFt (W f pFt (j>] G w p <3) . 

(A-3-137) 


Although a little teiius, the mean term Yp Fj ^(k) and the correlation term can 
be computed from Eqs. A-3-3 and A- 3-4 in a straightforward manner. Similar 
expressions hold for the normal mode of system operation. 
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A-3.4.2 Evaluation of Weighted Sum-Squared Residual (WSSR) Decision 
Statistic 


The WSSR statistic for the two approaches is given by 

N N 

S f (v ) « l v (k) W f (k) v(k) (A-3-138) 

k=l 

„ N 

S (u ) = I U (k) W (k) p(k) . (A-3-139) 

F k=l v 

For a given failure mode F and model £, the statistics in Eqs. A-3-138 and A- 
3-139 have non-central chi-squared distribution. However, for the number of 
samples N > 15, the statistics in Eqs A-3-138 and A-3-139 are approximately 
Gaussian. The computation of mean values is relatively easy: 


S Ff£ “ 1 tr [W^OO («p f £ ( k > + \ f£ (k> v F f £ (k) ^ (A-3-140) 

k 1 

S Fp£ - l tr [Wp 1 (k) G T E £ |Y p (k) Y p (k) /F }G ] (A-3-141a) 

k - ■ 1 

N 1 T T 

= Jj tr [W P (k> G (£ F (k) + V (k> Y pF (k))G] ‘ (A-3-141b) 

The computation of covariance proceeds as follows: 

4u - \ t s f <*>> 

4 n ’ h l s p ^ 


t - 


4u 


1- 


S 2 
Fp£ 


(A-3-142a) 


(A-3-142b) 
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We now compute the mean-squared value required in Eq. A-3-142: 


E £ {S^(v N )/F} 


E { X l v T (k) W f 1 (k) v(k) v T (j) W 1 (j) v(j)/F} 
* k=l j=l 


(A-3-143) 


In order to evaluate the expectation in Eq. A-3-143, we need the result of 
Theorem 3.3. 


Theorem 3.3 - Let and X£ be Gaussian random vectors of dimension n with 
means x^ and x 2 » Their covariances are E^ and E 2 » an< * the cross-covariance 
between x^ and x 2 is Then 


E 


{x^ A x 


1 2 


} - 




+ 2 tr (A E 12 



) 


+ tr (A Ej) tr (B E 2 ) . (A-3-144) 


Proof : Note that 


T . T „ 
x 1 A x l x 2 B x 2 


n n n n 

= 1111 

i=l j=l £= 1 m=l 


3 ij V X li X lj 


X 2£ X 2m 


(A-3-145) 


In order to evaluate the expectation of Eq. A-3-145, we need to find the ex- 
pectation of the product of four non-zero mean Gaussian random variables, 

E {x x,. x„„ x. }. To compute this expectation, we define 
11 lj 2£ zm 


v T - (Vj v 2 v 3 v 4 ) - (x u x 13 x 2t x 2b ) 
T 

“ = (ftlj w 2 W3 co 4 ) . 
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Let the covariance of v be E v and is of the form 




°lii 

a uj 

°12i i °12im 



°1U 

°ijj 

°12j£ °12jm 

I 

IT 
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°l2i£ 

a 12j l 

°2 U °2£m 



°l2im 

°12jm 

°2 tei °2mm 

The characteristic function of 

v is 




$(m) 

T 

= E {exp (-jrn v) } . 

Using the fact that 

v is 

normal 

, we have 





T_ 

1 T N 


4>(w) = exp 

(-j m v - 

- to E v <u) . 

From the property 





a 4 4>( oj) 


- J 4 E 

< V 1 v 2 v 3 

V !|) “ E l X li 

3 u> 3m Bui. 3 ui 
12 3 4 






we have 





E l x li x lj x 2* 

x ] 

= X. . 

X. X A x 
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li 

lj 2£ 


,1 (A- 3 


+ °lij °2ten + °12i & °12jm + °12im °12j£ * ( A_ ~ 


-146) 


-147) 
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Using Eq. A-3-147 in Eq. A-3-145, we have 



A x. 



n n 

l l 


i=l j=l 


n n 

E J a ij V *lj *2* *2m 

£=1 m=l 


+ °lij °2to + °12i£ °12jm + °12im °12j£^ 

T - -T 

= [Xj A x^ x 2 B X 2 J 


+ tr (A Z B S 2 ) 

+ 2 tr (A E B E* 2 ) . 


Using the result of Theorem 3.3 in Eq. A- 3-143, we have 

E * t S £ <vN) I F ) " 1| tr [ H f 1(k > "Ff2 <k) ' , F£« <k) H 2 

+ {I tr [w*\k> Oj.jjW]! 2 

K 

+ 2 ll tr [U~ l M [E £ {v(k) v T (j)|F} - v pf£ (k) v F ^(j)] 

j k 

Wf 1 ^) [E £ {v(j) v T (k) | F } - v Ff4 (j) v F f £ ( k >]] . 

(A-3-148) 


The required cross-correlation terms are defined in Eqs. A-3-129 through A- 
3-135. The computation of mean and covariance for the normal mode KF based 
FDI scheme and similar quantities for the parity space based FDI scheme is 
straightforward. Thus the statistical distance measures, introduced in this 
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section for parity optimization, provide convenient measures for the eval- 
uation of alternative FDI approaches as well. Note, in particular, that the 
evaluation measures are nonstationary and, hence, we have a mechanism to 
determine how fast failure information is being accumulated in various FDI 
schemes. 
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APPENDIX B 


ROBUST ACCOMMODATION 

Our general methodology and specific analytical results have focussed on 
robust failure detection and isolation (FDI) as the most important aspect of 
the overall FDIA problem. In this appendix, many of these results are inter- 
preted and extended to solve the robust accommodation problem. 

The sensor accommodation problem is one which ultimately must include the 
performance goals and trade-offs of the entire system under consideration. 
Excellent examples of these kind of considerations are given in [3] and [11]. 
One appealing means of accommodating sensor failures involves the estimation 
of the variables which are no longer available. This method is employed in 
[3] and [11], however control system modifications are also required because 
of the inaccuracies of the particular estimation algorithm which was employed. 

As in the case of FDI, modeling errors play a large role in the ultimate 
performance of any estimation algorithm. Therefore, estimators employed in 
accommodation algorithms should be designed for insensitivity to modelling 
errors where possible. 

The first step in designing any estimation algorithm is the determination 
of a model for the desired variables. 

Consider the discrete time state space description, 

x(k+l) = A x(k) (B-l) 
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y(k) = C x(k) + n(k) 


(B-2) 


Input terms in B-l and B-2 can be included in the results which follow and 
similar results derived. 

In Section 4 and Appendix A, we discussed the formation of parity checks 
based on such models that take the form v(k) = GYp(k) , where Y p (k) is a p- 
window of m dimensional observations, y(k), and G is the tx(p+l)m parity check 
matrix. Furthermore, we showed how each parity check (which is normally a 
zero mean white noise sequence) could be interpreted as either an ARMA model 
of the ith element of y(k), y^ , or as a state space model of a p-window of 
Yi(k). 

When modeling errors exist in the parameters (A,C) of Eq. B-l, the robust 
parity checks can be interpreted as robust ARMA or state space models for 
y^(k) . These can then be used in an estimation algorithm as described below. 

In order to estimate sensor values both during no-failure and with a bad 
sensor, we can design an estimator which is; 1) robust to modeling errors by 
using parity checks as ARMA models, and 2) uses only those sensors which are 
validated by the FDI algorithm. Let the parity check which is used for 
estimating yi be written in terms of the y± ARMA model; 

P P 

yi(k) = l UL yi(k-j) + l l Uj y£(k-j) (B-3) 

j=I **i j=0 

Then, during normal operation, we have a closed loop estimator for all sensor 
values y^, i=l N-sensors; 

yi(k) = 7i(k) + K[ yi (k) - 7i(k)] (B-4) 
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where 


p P 

yi(k) = [ Uj yi(k-j) + l l yj y£(k-j) (B-5) 

j=l itt j=0 

When sensor y^ is identified as failed we must change the estimates for all 
sensor values. For y^, we perform the "open loop" scheme, 

P P P 

yi(k) = l u>j yi(k-j) + l l Pj y£(k-j) (B-6) 

j=l l*L J"0 

For y£ such that i*£ we use ABMA models (parity checks) for y £ that don't 
include yj in a closed loop estimator of the form (2) and (3), or, use the 
open loop estimate for y^ in computing y£. 

Although the interpretation of parity checks as robust ARMA models can be 
utilized in forming models for various estimation algorithms, a more direct 
solution to the problem of finding robust ARMA models for the measured vari- 
ables can be developed. Consider the state space description 


x(k+l) = A x(k) 

(B-7) 

y(k) = C x(k) 

(B-8) 


with p(A = A £ , B = B £) = p^ , and suppose we wish to find a pth order AR model 
for the vector y(k); 


P 

y(k) = l y(k-j) 

j=i 


where is an mxm matrix. 


(B-9) 
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As in our approach to parity generation, we would like a metric which is 
related to estimation accuracy and is easily optimized by choice of the 4>j. 
Consider the mean square estimation error defined by 


J = |y( k ) - y( k >l I 2 (B-io) 

where the ||*|| operator is the standard Euclidean distance. 

Rewriting Eq. B-10 in terms of the state variable x(k-p) using B-7 and 
B-8, we have, 


P r p - j 2 

J = E £ {| | [C j?A£ - l ®jC£A£ ] x| I } 

j=l 


(B-ll) 


Since we are interested in solutions which are independent of the state, 
x, the problem of choosing reduces to, 


min E £ { ||C*a/ - l CjeA i j || } 

J-l 


(B-12) 


The expectation operator can be taken inside, the gradient with respect to 4>j 
taken and set equal to zero yielding the following. 


Let, 


M £ 


c £ A* 


p-1 


Cfc A* 


C£ 


P-2 


(B-13) 


bi 


0 A p 

* ith row of Cj/ii 


(B-14) 
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C-0 


— 4>l — 

<t>2 


(B-15) 


A 

9 = [ 4>i | <&2 1 


K] = 


Define, 

C = E £ {M £M (T } 
bi = E £ {M £b^ ^ } 


(B-16) 

(B-17) 


Then the solution which minimizes D-12 is given by the equation 

C <1* = bi (B-18) 

which is easily solved for <J>^ providing C is nonsingular. When no modeling 
errors are present, C is less then full rank implying that more than one 
solution to D-18 is possible. 

Finally we note that the above result is strongly related to the Levinson 
filtering algorithm [17] which utilizes the assumption of stationarity to 
derive a statistically based metric that results in linear equations for the 
AR coefficients and are based on the Toeplitz autocovariance matrix of the 
sequence y(k). 
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SOFTWARE DESCRIPTION 

The robust FDI software is designed to be used for the development and 
evaluation of detection algorithms for sensor failures in dynamic systems. It 
was developed on ALPHATECH's VAX 11-750, in VAX-11 FORTRAN-77. 

The software consists of four separately executable programs: 1) DESIGN 

computes parity check vectors, by one of several methods, from the dynamic 
system matrices; 2) PARCEV evaluates the parity check vectors for their abil- 
ity to distinguish sensor failures from each other and from no failure; 3) 
KFRES evaluates the ability of a Kalman filter to distinguish sensor failures 
from each other and from no failure; and 4) SIMULATE simulates the states, 
controls and outputs (sensor measurements) of the dynamic system over time, 
including sensor failures if desired, and computes residuals, using parity 
vectors or a Kalman filter, in order to detect and identify sensor failures. 
The software is modular and well-commented for easy modification, checks for 
faulty input data that would otherwise cause a program to crash, and alerts 
the user to problems encountered during execution, such as an attempt to 
invert a singular matrix. Each program is now discussed in detail. 

C-l . PROGRAM DESIGN 

The dynamic system equations underlying all of the robust FDI software 
are of the form: 

_x(k+l) = A ]c(k) + B u(k) (C-la) 


C-l 



(C-lb) 


y(k) = C x(k) + D u(k) + v(k) 

u i(k) = / qi • N(0, 1) (C-lc) 

with white noise vectors: 

Vi(k) = /TT- N(0,1) (C-ld) 

where _x is the state vector, u is the vector of controls, y_ is the output vec- 
tor; A, B, C and D are the system matrices; _r is the measurement noise covar- 
iance vector, and is the control noise covariance vector. In order to make 
use of these equations, several parameters must be defined. First, the dimen- 
sions of the system are needed: the number of states NS, the number of con- 

trols NC, and the number of outputs NO; these parameters determine the sizes 
of the system matrices and noise vectors, as well as the state, control and 
output vectors. Also, the order NP of the desired parity check vectors, which 
determines the time window length to be used in residual generation, must be 
specified. Finally, the number of system models NL is needed; this refers to 
the fact that randomly perturbed system matrices, A^, B^, C^, and D£ are com- 
puted from the true matrices for Z = 1, •••,NL, where, for example 

A£ = A + 6A (C-2a) 

and 6A is a matrix of zero-mean Gaussian perturbations. Each system model has 
an associated probability P % of being true, such that: 

NL 

Z P* = 1 (C-2b) 

if - 1 
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In program DESIGN, the formation of parity vectors from the system 
matrices and noise vectors proceeds in two steps: first, a composite system 

matrix C c is formed; and second, the parity vectors are computed from the C 0 
matrix. Each of these two steps may be performed by two different methods. 
The C Q matrix may be formed by either the Null Space method or the Auto- 
correlation Function method; while the parity vectors may be computed by 
either the Robust Residual method or the Robust Detection method. Each of 
these methods is described below. 

Both methods of computing the C Q matrix begin by forming two matrices M £ 
and r £ from randomly perturbed system matrices, as follows: 


M £ = 


V 


c 

c j Cl 


(C-3a) 


h 


D£ 

C £B£ 

C£A*B£ 


0 

D* 

C£B£ 


0 

0 

D£ 


0 

0 

0 


(C-3b) 


NP-1 NP-2 NP-3 

C StAl ®£ C£A£ B £ C£A£ B £ ••• D £ 
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If the Null Space method Is being used, the matrix Njj is then formed: 


N 


i 




and C Q is computed by the following sum: 


C 

o 


NL 

E 

fc=l 


T 

N N „ 
l l 


(C-4a) 


(C-4b) 


If, however, the Auto-Correlation Function method is used, then C Q is set 
equal to the Auto-Correlation Function matrix (ACF), which is computed by a 
more complicated procedure. First, the output and control noise covariance 
matrices are set: 


R P = 


R 

0 

0 


0 

R 

0 


0 

0 

R 


0 

0 

0 


(C-5a) 


Q 


o 


c = 

u 


0 


0 0 ••• 0 

Q 0 0 

0 Q 0 


(C-5b) 


0 


0 
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where Rp contains NF+1 copies of R, Cu contain o NP+1 copies of Q, and 


Q = diag [ £ ] 


R = diag [ r ] 


(C-5c) 

(C-5d) 


Next, for each system model Z, the following discrete time Lyapunov equation 


must be solved for 0* , the state covariance matrix: 

£ 

C x. * A i C x. A i +B Xi . 


(C-6) 


Then, the output covariance matrix fc v ) and the cross-covariance matrix 

y z 

(Cyu. ) can be computed : 


C = MO M* + r 0 c + R 
y ^ t x £ z z u z p 


(C-7a) 


c = r c 

yu £ Z u 


(C-7b) 


Finally, the Auto-Correlation Function matrix can be formed: 

NL 

ACF = Z P • 

Z 

Z=l 

and the C Q matrix may be set equal to the ACF matrix. 

Once the C Q matrix has been formed, using either the Null Space or the 
ACF method, the parity vectors may be computed in two ways. Under the Robust 
Residual method, the eigenvalues (Aj) and normalized eigenvectors (wj) of C 0 
are computed, and the eigenvectors with smallest eigenvalues are used as 


y u , 


y u „ u 


(C-8) 
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parity check vectors. If the Robust Detection method is used, then for j = 


!,••*, NO the vector bj must be formed as 



(C-9) 


where bj contains NP+1 copies of the unit vector ej in the j-th direction of 
size NO, and NPJ-1 copies of the zero vector of size NC. Then, the parity 
vectors and associated values are computed as follows: 


£ j = 



no -1 II 

o 


(C-lOa) 


A. 

J 


T 

— j C 0 wj • 


(C-lOb) 


Note that both _bj and wj have dimension (NP+1)(N0+NC) (like the dimensions of 
the C Q matrix). In both methods, small values of Aj indicate that residuals 
computed from wj will tend to be close to zero under normal operating 
conditions . 
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Another desirable feature of parity vectors is that they have a reason- 
ably large "signal-noise ratio." The signal-noise ratio of a parity vector w^ 
is computed by DESIGN as: 

T 

bj • BFMj 

S/Nj j = ■ (C-ll) 

V w^ - (ACF) - ^ - 

for each sensor output j = 1,***,N0, where BFMj is the bias failure magnitude 
for the j-th sensor. 

The basic structure of subroutine calls by DESIGN is given in Fig. C-l, 
although no attempt has been made to show multiple calls to the same routine, 
and calls to standard library routines for matrix operations are not shown. 
Table C-l gives a brief description of the purpose of each subroutine and 
function (including library routines). The inputs and outputs to DESIGN are 
described below. 

A sample input file to DESIGN is shown in Fig. C-2. The first line of 
data read in is for NS, NO and NC, the number of states, outputs and controls. 
Next to be read in are NPMAX the maximum order of the parity checks, and 
NP__ALL, a flag indicating whether to use all values of NP; if NP_ALL is true, 
DESIGN computes parity checks of order NP = 0, •••, NPMAX. Then N_PAR_CHECK 
specifies the number of parity checks desired; if the Robust Residual method 
is used, these will be the ones with lowest eigenvalues; if the Robust Detect- 
ion method is used, they will be the ones of lowest order. (Usually, if the 
Robust Detection method is used, N_PAR_CHECK should be set to (NPMAX+1)N0 in 
order to print all of the parity checks generated.) 
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Figure C-l. Subroutine Calls Made by Program DESIGN, 

Excluding Library Routines 

The next two lines of input data specify the generation of perturbed 
system matrices and the methods of parity check generation to be used. First, 
NL, TRUE_MODEL and SEEDO are read in. NL is the number of randomly perturbed 
system models used (the models are assumed to have equal probability of being 
correct, so P$, = 1/NL, for £ = 1,«**,NL). TRUE_MODEL is a logical variable 
(true or false) indicating whether the first system model (£^1) should use the 
unperturbed, correct system matrices; if TRUE_MODEL is false, the matrices for 
the first system model are randomly perturbed, as in the other models. SEEDO 
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TABLE C-la. SUBROUTINES CALLED 


Below la a 
DESIGN, PARCEV, 
Cures. 

ACF_HAT : 

DSVDC: 

M_GAMMA: 

OUTPUT: 

PERTURB: 

PLOT: 

PROB-DIST: 

RANPERT: 

READIN: 

READJCFR: 

READ_PAR: 

READ_SIM: 

RNORM: 

SN_RATIO: 

SORT_STACK 

SPRT: 

SYS_COVAR: 

WSSR: 


brief description of the purpose of every subroutine called by 
KFRES or SIMULATE. See Pigs. I 3, 4 and 6 for the call struc- 


conputes the Auto-Correlation Function (ACF) matrix 
(see Eq. E-8) 

LINPACK library singular value decomposition routine 
(used to compute eigenvalues and eigenvectors of 
symmetric C 0 matrix) 

computes the matrices M and T (see Eq. E-3) 

writes the output files for DESIGN, including parity 
vectors and slgnal-nolse ratios 

creates a perturbed version of the system matrices 

ALPHATECH PLOT library routine that creates data files 
for later ploctlng 

computes the Bhattacharyya distance and correlation 
between two multivariate Gaussian probability distributions 
(see Eq. E-15) 

function that computes a random perturbation 

reads the DESIGN Inputs from an Input file and echoes 
to an output file 

reads the KFRES Input file 

reads the PARCEV input file 

reads the SIMULATE Input file 

returns a random number from a Gaussian normal distribution 

computes the signal-noise ratios for the most recently 
computed parity vector and each sensor output 

sorts an array of real numbers, keeping an array 
of Integers In parallel 

computes sequential probability ratio statistics 
(see Appendix B) 

computes the covariance of the Kalman filter Innovations 
for a given (perturbed) system model (see Eq. E-21b) 

computes weighted sum of squared residuals statistics 
(see Appendix B) 
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TABLE C-lb. LIBRARY ROUTINES 


The following are library routines for matrix manipulation used by DESIGN, 
PARCEV, KFRES and SIMULATE (not shown in Figs. C-l, 3, 4, and 6). 


DGECOM : 

LQGALPHA library 
determinant of a 

MADD: 

LQGALPHA library 

MLINEQ: 

LQGALPHA library 
(used for matrix 

MMUL: 

LQGALPHA library 

MSUB: 

LQGALPHA library 
from another 

RICDSD: 

LQGALPHA library 

SAVE: 

LQGALPHA routine 
another matrix 

TRNATB : 

LQGALPHA library 
of a matrix 


routine used to compute the 
matrix 

routine for adding two matrices 

routine for solving linear equations 
inversion) 

routine for multiplying two matrices 
routine for subtracting one matrix 

routine for solving Riccati equations 
for copying a matrix into a part of 

routine for computing the transpose 


is the initial integer value of the random number generator seed. Next, the 
logical variables ACF_FLAG and RD_FLAG are read in. ACF_FLAG indicates 
whether to set the C Q matrix equal to the Auto-Correlation Function (ACF) 
matrix; if ACF_FLAG is false, the Null Space method of computing C Q is used. 
(The ACF matrix is computed in either case since it is needed to compute the 
signal-noise ratio.) RD_FLAG indicates whether the parity vectors should be 
computed by the Robust Detection method; if RD_FLAG is false, the Robust 
Residual method of computing parity vectors is used. 
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DESIGN* INPUTS 


HS> NO* NC 

4 5 5 

NF'HAX > HP. ALL 

0 T 

N. PAR. CHECK 

s 

--- ML. TRUE .HOl'EL • SEEPO 

20 P 23945? 

ACf .FLAG • RP.FLAG 

F T 

0 (IIC ) <346 PPM. 0. 0. 0. 0) 

5.343E-04 t.OE-08 1.0E-03 1.0E-08 I.0E-08 

RCIIO) <14 RPM. 12 RPM. .67 PSI. .09PSI. 2 PEG F) 

1.96E-06 6.49E-07 l.SE-06 4.86E-07 1.82E-06 

*FH<NO> 

5.0000000E-03 3 . 3333333E-03 5 . 4545455E-03 2 . 3076923E- 02 5 . OOOOOOOE-03 

PEL.T 

0.02 

A (MS* (IS) 

0.8776392 9 . 54 48047E-02 -7 . 6 1 8 1 483E- 03 3 . 21 2551 0E-02 

2.3492503E-03 0.9334235 2. 901 1461E-02 -S . 4782876E-02 

3 • 8686348E-04 - 1 . 40 4 4064E-0 4 0.9854547 1 . 1 503949E-04 

3.921 16I4E-02 *9. 333141 0£ -04 -7. SOS5437E-04 0.9563294 

b (MS* NC> 

4 . 4787161E-02 4.41 14124E-02 - 1 . 5856765E-02 2 . 9632282E-03 -3 . 6747955E-02 

2. 1774249E-02 1 . 4993310E-02 - 1 . 2 1 06993E -03 -2 . I 52S344E -03 - 1 . 267751 6E-02 

1 .4233845E-03 9 . 6010260E-04 - 7 . 8 1 24 1 2SE -05 -2 . 4199684E-06 9 . 1238425E-04 

1 . 7201 1 10E-03 5. 162831 6E -04 2 . 1 0821 I 6E -04 - 1 . 66006 1 9E-05 3 . 724 4"»93E - 03 

C(NO.NS) 

1.0000000 0. OOOOOOOETOO 0.0000000000 O.OOOOOOOEIOO 

0. OOOOOOOETOO 1.000000 0.0000000000 O.OOOOOOOEIOO 

-3.61 991 79E-02 0.8662740 - 1 . 753e831E-02 - 1 . 42561 81 E-02 

0.2330691 4 . 181 41 16E-03 - 2 . 3846S35E-02 -2 . 1576596E-02 

- 1 . 380429 -0.6447408 -0.2106439 2 . 4 1 3<47eE-0? 

PINO. NC> 

O.OOOOOOOEIOO 0 . OOOOOOOETOO 0 . OOOOOOOC T 00 O.OOOOOOOEIOO O.OOOOOOOEIOO 

0. OOOOOOOETOO 0. OOOOOOOETOO 0. OOOOOOOETOO O.OOOOOOOEIOO 0. OOOOOOOETOO 

0.2299521 -0.4200383 2 . 993 1 962E-02 4 . 63081 38E-03 -0.7l5474e 

0.1144533 -0.5347052 4 . 3202233E-02 1 . 8SS2763E-&4 1.2127*2 

0.9383181 0.5814369 - 1 . ee35*0CE -02 -3 . 4454 1 53E-03 0..3496292 

PEL. A 

1 .53e0000E-02 2 • 1 069999E-02 1 . 40 1 1 1 07E-03 1 . 21 69647E-03 

3. 1333333E-03 9 . I 000004E -03 8 . 753605 1 E- 04 I . 94367-3 l E - 04 

{ . 7636009E-03 1 . 1 24 1082E-03 2 . 4 200000C-03 1 . 1 20e0C0C -03 

4 . 4277371E-03 2 .8 1 02705E -03 2 . 7037999E -03 7 . 279999 j E -03 

PEL .8 

8.5195494E-03 1 . 6469 1 74E -03 5 . 1 8437CeE -04 2 . 206521 6E-04 e . 64 1 6323E-03 

2 . 4267395E -03 4 . 56 1 7996E -04 1 . 5556 1 02E- 04 0 . 5336 4 73E -OS 2 . 64 1 3030E-0-I 

8 . 5c659 1 4E-04 1 . 7633420E -04 5 . 5528 3e 4E-05 1 . 60 l 2 l 65E -05 B . 5322355E-04 

2 . 1 998493E -03 4 . 49 1 71 96E -04 1 . 3978 134E -04 3 . 9906063E -05 2. 1 532V46F. -O.'l 

PEL.C 

O.OOOOOOOEIOO 0. OOOOOOOETOO 0. OOOOOOOETOO 0 .OOOOOOOETOO 

0. OOOOOOOETOO O.OOOOOOOEIOO 0. OOOOOOOETOO 0. OOOOOOOETOO 

6 . 8665206C -02 0.1231569 3 . 7289 455C- 03 4 . 64904 1 2003 

3 . 646S530E -02 1.930634 1 . 9 1 7 1 6 1 9E - 03 3 . 1 758063E - 03 

0.1 172195 9.326351 4E-02 9 . 956000 4 E -03 1 . 2030000C - 02 

PEL . P 

O.OOOOOOOEIOO O.OOOOOOOEIOO O.OOOOOOOEIOO O.OOOOOOOEIOO O.OOOOOOOCTOO 

O.OOOOOOOEIOO O.OOOOOOOEIOO O.OOOOOOOEIOO O.OOOOOOOEIOO 0. OOOOOOOETOO 

1 . 1 51 e222E-02 3 . 2224 100C- 02 i . 029359 IE - 03 3.0286349C 03 0.1696220 

6 .5602000E-02 ? . 4G39703E-02 3 . 727VDOOE -03 2.71 6532SE -0 4 1 . 1 060939E -02 

6 . 3202062C -02 4 .040461 4E-02 e . 7e60543E - 03 3 . 29024 25E -03 0.2e4054C 


Figure C-2. Inputs to Program DESIGN. 
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The next two inputs are the noise parameters. Q is an NC-vector which is 
the diagonal of the control noise covariance matrix; R is an NO-vector which 
is the diagonal of the measurement noise covariance matrix. (Off-diagonal 
elements of both matrices are zero.) BFM is an NO-vector which contains the 
bias failure magnitude for each sensor; and DEL_T is the time increment. 

The final inputs to DESIGN are the correct system matrices, A, B, C and 
D, and the matrices containing the standard deviations of each element for 
random permutation, DEL_A, DEL_B, DEL_C and DEL_D. Matrices A and DEL_A have 
dimensions NS by NS; B and DEL_B are NS by NC; C and DEL_C are NO by NS; and D 
and DEL_D are NO by NC. 

The primary output file for DESIGN contains an echo of the input file, 
and the parity check vectors and related information: the number of the par- 

ity check (i=l, . . . ,N_PAR_CHECK) , the order (NP), the eigenvalue (Xj), and the 
vector itself (wj). In addition, there is a secondary output file which pro- 
vides some extra information. The secondary output file includes an echo of 
the input file, the trace of the C 0 matrix (for each NP), a neater version of 
the parity vectors but with less precision, and, for each parity vector, the 
standard deviation, the sums of the elements which will be used as coeffi- 
cients of each sensor output, and the signal-noise ratios. 

C.2 PROGRAM PARCEV 

Program PARCEV evaluates a set of parity check vectors, such as those 
generated by program DESIGN. PARCEV begins by forming a matrix containing the 
parity check vectors being evaluated, as follows: 
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T 

wi 


W = 


T 

W2 


(C-12) 


T 

^ _J>AR^_CHECK 


Next, the Auto-Correlation Function Matrix (ACF) is formed, as in DESIGN (Eq. 
C-8), and the covariance matrix of the parity checks is computed: 

C v = W • ACF • WT (C-13) 


Then, for each parity check vector w^, the sum of the coefficients of each 
sensor output j are computed: 


C H = w i + wf + . . . + Wi for i = 1,...,N PAR CHECK 

j — ‘•j -^NCH-j -^NP.NO+j > ~ 


j - 1 NO 


(C-14a) 


and from these coefficients, the expected, or average, residual vector is 
formed, for each possible sensor bias failure, positive, negative or zero (no 
failure), as follpws: 


Vj = BFMj 


C i,l 

c i ,2 


, for j = -NO, ... ,-1,0,1, ... ,N0 


c i ,N0 


(C-14b) 
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where: 


BFM(- j) = -BFMj 


(C-14c) 


BFM 0 = 0 (C-14d) 

For any pair of average residual vectors, the Bhattacharyya Distance, 
which measures the distance between two multivariate probability distributions 
of specified mean and covariance, can be computed: 

1 _ _ _ 1 

B ij = “ ( v i " v j) T [“ ( c vi + C vj )]-1 (Vi - vj) 

o 2 


+ - £n 
2 


det[- (C v 4C V )] 
2 i j 


J det(C v )*det(C v .) 
If 1 J i 


(C-15a) 


(Note that, in this case, C v = C v =C V , so the second term is always zero.) 

1 j 

Furthermore, the Bhattacharyya correlation between two sensor failure hypothe- 
ses, and lower and upper bounds on the probability of error in distinguishing 
between two hypotheses are computed as follows: 

Pi j = e~ B ij (C-15b) 


LBi 


j " \ ( 1_ y' 1 - Pij 2) 


(C-15c) 


p ij 

U Bij = — 


(C-15d) 


The structure of subroutine calls in program PARCEV is shown in Fig. C-3; 
suboutines are described in Table C-l. The input file has precisely the same 
format as the primary output file for DESIGN, so that it may be run immediate- 
ly afterwards, if desired. The output file for program PARCEV contains an 
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Figure C-3. Subroutine Calls Made by Program PARCEV, 

Excluding Library Routines 

echo of the input file (i.e., the DESIGN output file), and the Bhattacharyya 
Distances and correlations and lower and upper bounds on the probability of 
error, for each pair of sensor bias failure hypotheses. 

C.3 PROGRAM KFRES 

Program KFRES evaluates the ability of a Kalman filter to distinguish 
sensor failures in much the same way that PARCEV evaluates a set of parity 
vectors for their ability to distinguish failures. It begins by computing, 
for eacli set of perturbed system matrices A£, B£, C£ and D£ (for fc=l , . . . ,NL) , 
a set of composite matrices, formed from the perturbed and unperturbed 
matrices as follows: 
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D £ = [D £ - D] 


(C-16a) 


where xg, up and yg are the base point vectors and p e is the percent error for 
base point perturbation. From the base point errors, the base point deviation 
vectors are computed: 

«5x b a = [I ~ Ap] xg* - Bgqg* (C-18a) 

<$yg £ = yg £ “ Cg*g* - Dgug* (C-18b) 

Next, for each possible sensor bias failure (i.e., for j = 0,1,..., NO), 
the augmented state vector is computed from the composite perturbed matrices 
and the base point deviations: 

" I 1 ~ F *]- 1 

where: 

bj = BFMj * ej (C-19b) 

and is a unit vector in the jth direction. The augmented covariance matrix 
must be solved for in the following equation: 

c a* = ?£ C a £ F £ T + g £ Q G £ T + K R K T (C-20) 

From the augmented state and covariance, the Kalman filter mean (or residual) 
and autocovariance function are computed: 

vJ*’ = Hg x a ^ + Dg ug^ + 5yg ^ + bj (C-21a) 

— T _ T _ _T 

C v jA = H g [C a A (Fgi) ] Hg + (Dg Q Dg + R) 6 1o (C-21b) 

— — _ T _ T _ _ T _ 

+ 0>g Q [Fg 1-1 Gg] Hg + R [Fg 1 ' 1 K] HgT) 5 ln>n 40 


G gug 4 + K (b j + 6 yg *) + ( 63 1 

0 


(C-19a) 
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where i is the time lag (i.e., = cov [\4^(k) vJ ^(k+i)T] ). Then, the 

average residual and covariance over all perturbed models may be computed for 
each sensor failure ( j=0, 1 , . . . ,N0) : 



NL 

I 

£=1 


(C-22a) 



HU ^ 

l [Cyj^ + Vl A ] 

£=1 


vj( vj) T 


(C-22b) 


From these, the Bhattacharyya distances and correlations, and lower and upper 
bounds on the probability of error in distinguishing sensor failures, may be 
computed as in program PARCEV (Eq. C-15). 

The structure of subroutine calls in program KFRES is shown in Fig. C— 4 ; 
subroutines are described in Table 0-1. The inputs to KFRES are the inputs to 
DESIGN, with a few additions, shown in Fig. C-5. These include MAX_LAG, the 
time lag used to compute the innovations covariance (i in Eq. C-21b); the 
random number generation seed (KF_SEED) and percent error (PER ERR) used to 
compute the base point errors; the base point vectors (XB, UB, and YB); and 
the Kalman gain matrix (KG). The output file contains an echo of the input 
file, the average residual vector an covariance matrix for each possible 
sensor failure, and the Bhattacharyya distance and correlation and the lower 
and upper bounds on the probability of error in distinguishing failures, for 
each pair of possible sensor failures. 
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Figure C-4. Subroutine Calls Made by Program KFRES, 
Excluding Library Routines 


KFRES INPUTS 


MAX LAG 

0 

KF SEED, PER ERR 


700549 

XB(NS) , 
0.8900504 

0.0 

UB(NC) , YB(NO) 
0.7799951 

0.1162586 

8.6679444E-02 


0.5335010 

0.6000000 

-0.5000000 

0.6000000 

O.OOOOOOOOE+OO 

0.8900504 

0.7799951 

0.5165927 

0.3274180 

-0.8392528 

KG (NS, NO) 

1.487D-01 2.183D-01 

7.458D-02 

1.428D-01 

-2.716D-01 

7.242D-02 

1.061D-01 

3.635D-02 

6.955D-02 

-1.323D-02 

4.811D-03 

7.052D-03 

2.407D-03 

4.614D-03 

-8.793D-03 

5.883D-03 

8.641D-03 

2.950D-03 , 

5.643D-03 

-1.075D.02 


Figure C-5. 

Inputs to 

Program KFRES 
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C.4 PROGRAM SIMULATE 

Program SIMULATE simulates the states, controls and outputs (sensor mea- 
surements) of the dynamic system described by Eq. C-l, including, if desired, 
sensor bias failures in any or all sensors, each beginning at a specified 
time. Two types of residuals are generated at each time step: robust parity 

check residuals and Kalman filter residuals. In addition, the robust parity 
check residuals are used to compute WSSR and SPRT statistics, in order to 
detect, verify and isolate sensor failures. Before robust residual generation 
begins, each parity vector is divided into two parts: the first (NP+1)N0 
elements become Wy , which is to be multiplied by the outputs window vector; 
the last (NP+1 )NC elements become Wy , which is to be multiplied by the con- 
trols window vector. 

The dynamic system simulation used computes the deviations of the state 
( <5x) , controls ( 6u) and outputs ( 6y) from a base point (xb, ub, £b) which is 
consistent with the system matrices. The control and associated steady state 
deviations are initialized as 


6ui(0) = /IT N(0,1) 


(C-23a) 


6x(0) = 0 


(C-23b) 


and the deviations are simulated as in the dynamic system equations (Eq. C-l), 
that is: 


6x(k+l) = A 6x(k) + B 6u(k) 


(C-24a) 


6y(k) = C 6x(k) + D <5u(k) + v(k) 


(C-24b) 


6u(k) = n (k) 
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where the system matrices A, B, C and D, the white noise vectors v_ and ri, and 
the time increment At are as in Eq. 1. 

During simulation, the control and output deviation vectors are stored 
over time in window vectors, as follows: 


Ip(k) 


^y(k-NP) 

<5y(k-NP+l) 


(C-25a) 


6y(k) 


Up(k) = 


6u(k-NP) 

6u(k-NP+l) 


(C-25b) 


&(k) 


The robust residuals may be computed at any time step from the parity vectors 
and the window vectors as 

T T 

v i(k) = Wy^ Yp(k) + Wj» Up(k) • (C-26) 

Under ordinary conditions, these residuals should be close to zero; large 
residuals are indicative of sensor output failures or large modeling errors. 
These residuals are then used to compute statistics to detect and identify 
sensor failures, as described in the FDI equations of Appendix B. 

The Kalman filter residuals are computed by a simple Kalman filter, using 
the unperturbed system matrices, as follows: 
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v(k) 


6y(k) - 6y(k) 


(C-26a) 


where 


6y(k) 

= C 6x(k) + D <5u(k) 

(C-26b) 

<Sx(k) 

= A <Sx(k-l) + B<5u(k-1) 

(C-26c) 

6x(k) 

= 6 x(k) + K \<k) 

(C-26d) 

<5x(0) 

= 0 

(C-26e) 


and K is the Kalman gain matrix. 

The structure of subroutine calls in program SIMULATE is given in Fig. 
C-6; Table C-l gives a brief description of the purpose of each subroutine 
and function. The input and output files for SIMULATE are described below. 

The input file for SIMULATE has the same format as the primary output 
file for program DESIGN, with some additions, as shown in Fig. C-7. The first 
of these is TMAX, the maximum simulation time (simulation begins at time 
zero). The second input is PERTURB_FLAG , a logical flag (true or false) which 
indicates whether or not to perturb the system matrices used in Eq. C-24. 

Next is SIM_SEED, the initial integer value of the random number generator 
seed, for computing Gaussian noise. 

The next inputs specify sensor bias failures during simulation. First, 
N_FAILURE is read in, which is the number of bias failures to occur. Then, 
for each failure, the following are read in: FAIL_TIME, FAIL_BIAS and 

FAIL_SENSQR, which are, respectively, the time that the bias occurs, the 
amount of the bias, and the number of the sensor in which the bias occurs. 
Ordinarily, a bias should cause some of the residuals to become large. 
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Figure C-6. Subroutine Calls Made by Program SIMULATE, 

Excluding Library Routines 

beginning at the bias time. Following the bias failure inputs are the plot- 
ting inputs. NR_PLOT gives the number of robust parity check residuals whose 
simulated values should be stored at each time step in a data file, for later 
plotting; PLOT_INDEX specifies which residuals (NR_PLOT of them) should be 
stored in plot files; and KF_PLOT is a flag indicating whether or not to 
create plot files containing the Kalman filter residuals. Next is KFG, the 
Kalman gain matrix used in the Kalman filter (Eq. C-26). 

The final inputs to SIMULATE are used to compute the FDI statistics. 
First, for each WSSR trigger statistic (N_WSSR of them), several quantities 
are specified. TAU_WSSR is the time constant used in the low-pass filter for 
the statistic; THR WSSR is the threshold for the statistic, above which 
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Figure C-7. Inputs to Program SIMULATE 
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is indicated a possible failure; NWS_RES is the number of parity check resid- 
uals used in the statistic and WS_RESN specifies which ones; and WS_COV is the 
covariance matrix for the residuals used. Then, for each SPRT verify statis- 
tic (N_VERIF of them) and each SPRT isolate statistic (N_ISOL of them), the 
following are specified: THRJSPRT, NSP_RES, SP_RESN and SP_COV are analogous 

to their USSR counterparts; SP_SIG1 and SP_SIG2 are the expected values (sig- 
natures) of the residuals used, under two different failure hypotheses. (See 
Appendix B for a further explanation of the use of FDI statistics). 

The output file for program SIMULATE contains an echo of the input file, 
and an indication of when any FDI statistic passes its threshold, and when 
(if ever) a sensor failure is identified. In addition, plot files are created 
containing the parity check residuals, Kalman filter residuals, and WSSR and 
SPRT statistics. 
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